|
32047
|
1241
|
51
|
2026-05-13T09:04:04.620988+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-13/1778 /Users/lukas/.screenpipe/data/data/2026-05-13/1778663044620_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it transferred and transcribed? Is there a work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is designed to be a local, privacy-first application: by default, its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for working out which speaker is talking when. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to quickly search for a phrase you heard in a meeting three weeks ago.
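The storage step can be illustrated with a minimal sketch, assuming Python's bundled SQLite was built with FTS5 (most builds are). The table name `transcripts`, its columns, and the sample rows are hypothetical and are not ScreenPipe's actual schema.

```python
import sqlite3

# Hypothetical schema for illustration only; ScreenPipe's real tables may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(speaker, text)")
conn.executemany(
    "INSERT INTO transcripts (speaker, text) VALUES (?, ?)",
    [
        ("me", "let's move the backup job to the NAS tonight"),
        ("other", "the quarterly report is due on Friday"),
        ("me", "remind me to check the backup logs tomorrow"),
    ],
)


def search(query: str) -> list[tuple[str, str]]:
    """Return (speaker, text) rows that match an FTS5 query, oldest first."""
    rows = conn.execute(
        "SELECT speaker, text FROM transcripts "
        "WHERE transcripts MATCH ? ORDER BY rowid",
        (query,),
    )
    return rows.fetchall()


hits = search("backup")
for speaker, text in hits:
    print(f"{speaker}: {text}")
```

The `MATCH` operator consults the full-text index rather than scanning every row, which is why keyword lookups stay fast as the transcript history grows.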
3. The "Work in Progress" (WIP) Stage
There is a short delay between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If audio arrives faster than it can be transcribed (for example, during a rapid, multi-person conversation), the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
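The buffering, queueing, and finalization steps above can be sketched as a simple producer/consumer loop. This is an illustration of the general pattern, not ScreenPipe's implementation: `fake_transcribe` is a hypothetical stand-in for the speech-to-text call, and the `done` list stands in for the database.

```python
import queue
import threading


def fake_transcribe(chunk: bytes) -> str:
    # Hypothetical stand-in for a real speech-to-text call.
    return f"<{len(chunk)} bytes transcribed>"


def run_pipeline(chunks):
    """Queue audio chunks and transcribe them in order on one worker thread."""
    pending = queue.Queue()  # the "work in progress" backlog
    done = []                # stands in for rows committed to the database

    def worker():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more audio is coming
                break
            index, chunk = item
            done.append((index, fake_transcribe(chunk)))

    thread = threading.Thread(target=worker)
    thread.start()
    for index, chunk in enumerate(chunks):
        pending.put((index, chunk))  # capture side: enqueue as audio arrives
    pending.put(None)
    thread.join()
    return done


results = run_pipeline([b"\x00" * 16000, b"\x00" * 8000])
```

Anything still sitting in `pending` corresponds to audio that has been captured but not yet transcribed.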
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
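The filenames themselves carry the device name, the direction, and a start timestamp, so they can be inspected without touching the database. Below is a minimal sketch that parses names of the shape quoted above. The pattern is inferred only from those four example filenames, and whether the timestamp is local time or UTC is not stated in this conversation, so the result is left as a naive datetime.

```python
import re
from datetime import datetime

# Pattern inferred from the example filenames quoted in this conversation;
# other ScreenPipe versions may name files differently.
PATTERN = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)_"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)


def parse_recording_name(name: str):
    """Return (device, direction, start time), or None if the name does not fit."""
    match = PATTERN.match(name)
    if match is None:
        return None
    started = datetime.strptime(match["stamp"], "%Y-%m-%d_%H-%M-%S")
    return match["device"], match["direction"], started


print(parse_recording_name("System Audio (output)_2026-05-11_06-17-14.mp4"))
print(parse_recording_name("LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4"))
```

Grouping the parsed results by device or by day gives a quick overview of what has been recorded in the data folder.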
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
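The archiving idea can be sketched as a short script that a cron job might run. This is an illustration under assumptions, not a tested ScreenPipe workflow: the function name, the 30-day threshold, and the archive path in the usage comment are hypothetical, and it assumes ScreenPipe follows symlinks when it opens media files.

```python
import shutil
import time
from pathlib import Path


def archive_old_recordings(data_dir: Path, archive_dir: Path, max_age_days: float) -> list[Path]:
    """Move old .mp4 files to archive_dir and leave symlinks in their place."""
    cutoff = time.time() - max_age_days * 86400
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for src in sorted(data_dir.glob("*.mp4")):
        # Skip files that were already archived (symlinks) or are still recent.
        if src.is_symlink() or src.stat().st_mtime >= cutoff:
            continue
        dest = archive_dir / src.name
        if dest.exists():
            continue  # never overwrite an existing archive copy
        shutil.move(str(src), str(dest))
        src.symlink_to(dest.resolve())  # the original path keeps working
        moved.append(src)
    return moved


# Example call with hypothetical paths:
# archive_old_recordings(Path.home() / ".screenpipe" / "data",
#                        Path("/Volumes/nas/screenpipe-archive"), 30)
```

If the NAS is not mounted, the symlinks dangle and playback of archived recordings fails until the mount returns, so it is worth trying this on a copy of the data folder first.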
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
sk (Slovak)
bg (Bulgarian)
en (English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Base' or 'Small' instead of 'Tiny') if your hardware has the headroom for it.
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"New Tab","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"New 
Tab","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.014960106,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.18994413,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.3463687,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.3575419,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.38068634,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). 
You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.05817819,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"bounds":{"left":0.17885639,"top":0.0,"width":0.0034906915,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.21010639,"height":0.037110932},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.011303191,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"bounds":{"left":0.120678194,"top":0.0,"width":0.008144947,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"bounds":{"left":0.14960106,"top":0.0,"width":0.020944148,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. 
Think of this as the raw archive.","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.21542554,"height":0.037110932},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"bounds":{"left":0.09142287,"top":0.029928172,"width":0.028922873,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"bounds":{"left":0.12034574,"top":0.029928172,"width":0.106715426,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"bounds":{"left":0.09142287,"top":0.029928172,"width":0.21991356,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"bounds":{"left":0.09142287,"top":0.050678372,"width":0.22174202,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"bounds":{"left":0.0787899,"top":0.13288109,"width":0.22456782,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"bounds":{"left":0.0787899,"top":0.18715084,"width":0.036402926,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"bounds":{"left":0.091755316,"top":0.19592977,"width":0.017785905,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"bounds":{"left":0.075465426,"top":0.23184358,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"bounds":{"left":0.08610372,"top":0.23184358,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and 
export","depth":23,"bounds":{"left":0.09674202,"top":0.23184358,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"bounds":{"left":0.107380316,"top":0.23184358,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"bounds":{"left":0.11801862,"top":0.23184358,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"bounds":{"left":0.14029256,"top":0.292498,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":21,"bounds":{"left":0.16023937,"top":0.30207503,"width":0.13696809,"height":0.09577015},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"bounds":{"left":0.068484046,"top":0.3028731,"width":0.019946808,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":23,"bounds":{"left":0.16023937,"top":0.30367118,"width":0.13663563,"height":0.13128492},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"bounds":{"left":0.29720744,"top":0.30207503,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"bounds":{"left":0.3025266,"top":0.42976856,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"bounds":{"left":0.09208777,"top":0.43216282,"width":0.030917553,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini 
said","depth":23,"bounds":{"left":0.08976064,"top":0.4736632,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"bounds":{"left":0.08976064,"top":0.47565842,"width":0.04105718,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"bounds":{"left":0.0787899,"top":0.47685555,"width":0.025930852,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"bounds":{"left":0.106715426,"top":0.47805268,"width":0.011136968,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"bounds":{"left":0.0787899,"top":0.47685555,"width":0.23005319,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"bounds":{"left":0.0787899,"top":0.5271349,"width":0.1341423,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"bounds":{"left":0.21492687,"top":0.528332,"width":0.053025264,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the 
system.","depth":27,"bounds":{"left":0.0787899,"top":0.5271349,"width":0.22623006,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"bounds":{"left":0.0787899,"top":0.5885874,"width":0.234375,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"bounds":{"left":0.0787899,"top":0.59018356,"width":0.08194814,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"bounds":{"left":0.0787899,"top":0.61652035,"width":0.20761304,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"bounds":{"left":0.24883644,"top":0.63846767,"width":0.025099734,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"bounds":{"left":0.0787899,"top":0.63727057,"width":0.22240691,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"bounds":{"left":0.105053194,"top":0.65802073,"width":0.11419548,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"bounds":{"left":0.21924867,"top":0.65802073,"width":0.0013297872,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search 
your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"bounds":{"left":0.0787899,"top":0.6875499,"width":0.22755983,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"bounds":{"left":0.27293882,"top":0.7094972,"width":0.011136968,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"bounds":{"left":0.0787899,"top":0.70830005,"width":0.23354389,"height":0.07861133},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"bounds":{"left":0.0787899,"top":0.811253,"width":0.234375,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"bounds":{"left":0.0787899,"top":0.81284916,"width":0.09840426,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. 
If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"bounds":{"left":0.0787899,"top":0.83918595,"width":0.23254654,"height":0.07861133},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"bounds":{"left":0.117519945,"top":0.90263367,"width":0.011136968,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"bounds":{"left":0.0787899,"top":0.90143657,"width":0.23038563,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"bounds":{"left":0.0787899,"top":0.98363924,"width":0.234375,"height":0.01636076},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"bounds":{"left":0.0787899,"top":0.98523545,"width":0.10405585,"height":0.014764547},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"bounds":{"left":0.0787899,"top":1.0,"width":0.2209109,"height":-0.011572242},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"bounds":{"left":0.14660904,"top":1.0,"width":0.011303191,"height":-0.05426979},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"bounds":{"left":0.0787899,"top":1.0,"width":0.23121676,"height":-0.05307257},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob 
storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. 
This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes I will do that. 
Is there a way to setup languages to transcription?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As for setting up languages for transcription,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes, you can configure it","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", but it helps to understand how ScreenPipe handles language natively first.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Transcription Engine Handles Language","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Transcription Engine Handles Language","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By 
default, ScreenPipe uses a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"multilingual Whisper model","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and sets the language configuration to","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-Detect","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Advantage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Drawback:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. 
When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How to Force a Specific Language","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How to Force a Specific Language","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are using the Desktop App UI:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Open the ScreenPipe settings.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Navigate to the","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AI/Transcription","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"section.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Look for 
the","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper Language","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Transcription Language","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"dropdown.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Change it from \"Auto\" to your specific language (e.g., English, Bulgarian, or Slovak).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are running ScreenPipe via CLI/Config:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You can modify your underlying configuration (usually found in","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/pipe.json","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or by passing a flag when starting the daemon) to force a specific language code. 
You would use the standard ISO language codes:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Slovak)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Bulgarian)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(English)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more 
options","depth":22,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXTextArea","text":"Ask Gemini","depth":20,"bounds":{"left":0.08211436,"top":0.83439744,"width":0.22573139,"height":0.01915403},"on_screen":true,"value":"Ask Gemini","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Ask Gemini","depth":21,"bounds":{"left":0.08211436,"top":0.8347965,"width":0.030086435,"height":0.018355945},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file menu","depth":20,"bounds":{"left":0.078125,"top":0.87031126,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Tools","depth":18,"bounds":{"left":0.094082445,"top":0.87031126,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open mode picker","depth":20,"bounds":{"left":0.27044547,"top":0.867917,"width":0.026097074,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Pro","depth":23,"bounds":{"left":0.2757646,"top":0.87669593,"width":0.007480053,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Microphone","depth":19,"bounds":{"left":0.29853722,"top":0.867917,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"toggle 
button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Send message","depth":19,"bounds":{"left":0.30485374,"top":0.8671189,"width":0.013962766,"height":0.033519555},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":false,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini is AI and can make mistakes, including about people.","depth":17,"bounds":{"left":0.11702128,"top":0.92178774,"width":0.11170213,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-4408640441979658528
|
9209072993111256029
|
click
|
accessibility
|
NULL
|
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", that is, working out which speaker is talking in each segment. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
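To make the FTS5 search step concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name demo_transcripts and its columns are hypothetical and chosen only for illustration; ScreenPipe's real schema may look different. It also assumes your Python's SQLite build includes the FTS5 extension, which is common but not guaranteed.

```python
import sqlite3


def build_demo_index(rows):
    """Create an in-memory FTS5 index over (spoken_at, transcript) rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table and column names, not ScreenPipe's actual schema.
    # UNINDEXED keeps the timestamp stored but out of the text index.
    conn.execute(
        "CREATE VIRTUAL TABLE demo_transcripts "
        "USING fts5(spoken_at UNINDEXED, transcript)"
    )
    conn.executemany(
        "INSERT INTO demo_transcripts (spoken_at, transcript) VALUES (?, ?)",
        rows,
    )
    conn.commit()
    return conn


def search(conn, phrase):
    """Return (spoken_at, transcript) rows matching an FTS5 query, best match first."""
    cur = conn.execute(
        "SELECT spoken_at, transcript FROM demo_transcripts "
        "WHERE demo_transcripts MATCH ? ORDER BY rank",
        (phrase,),
    )
    return cur.fetchall()
```

Calling search(conn, "budget") returns only the rows whose transcript contains that word, which is the same kind of lookup that makes old meetings searchable.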
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
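The buffer, queue, and finalize steps above follow a generic producer/consumer pattern, which can be sketched as below. This is an illustration of the pattern only, not ScreenPipe's implementation: fake_transcribe is a hypothetical stand-in for the speech-to-text call, and the "commit" is just a list append.

```python
import queue
import threading


def fake_transcribe(chunk: bytes) -> str:
    """Hypothetical stand-in for a speech-to-text call."""
    return f"<{len(chunk)} bytes transcribed>"


def run_pipeline(chunks):
    """Push audio chunks through a queue and collect results in order."""
    pending = queue.Queue()  # chunks waiting here are the "work in progress"
    done = []

    def worker():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more audio is coming
                break
            index, chunk = item
            # Finalization: the result is stored only after transcription.
            done.append((index, fake_transcribe(chunk)))

    thread = threading.Thread(target=worker)
    thread.start()
    for i, chunk in enumerate(chunks):
        pending.put((i, chunk))  # capture side enqueues without waiting
    pending.put(None)
    thread.join()
    return done
```

While the worker is busy, anything still sitting in pending has been heard but not yet saved, which is exactly the gap described above.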
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
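If you want to check this against your own database rather than take the description on trust, a small sketch like the following lists the tables and shows the columns of one of them. It deliberately assumes no column names, because the actual schema is not confirmed here; the only name taken from the discussion is the audio_transcriptions table, and even that should be verified against the output of list_tables.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all ordinary tables in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def describe_table(conn, table):
    """Return (column_name, declared_type) pairs for one table."""
    if table not in list_tables(conn):
        raise ValueError(f"no such table: {table}")
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    # The name was checked against sqlite_master above, so quoting it is safe.
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return [(row[1], row[2]) for row in rows]
```

To point it at a real database, open the file read-only so the inspection cannot interfere with the recorder, for example sqlite3.connect("file:/path/to/db.sqlite?mode=ro", uri=True), where the path is whatever location your installation uses.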
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
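The archiving idea can be sketched as a small script that a cron job runs on a schedule. Everything specific here is an assumption for illustration: the function name, the 30-day threshold, the use of file modification time as the age, and the restriction to .mp4 files. Whether ScreenPipe follows symlinks during playback is also not confirmed, so it is worth testing on a single file first.

```python
import os
import shutil
import time
from pathlib import Path


def archive_old_media(data_dir, archive_dir, max_age_days=30, suffix=".mp4"):
    """Move files older than max_age_days to archive_dir, leaving symlinks behind."""
    data_dir, archive_dir = Path(data_dir), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    cutoff = time.time() - max_age_days * 86400
    moved = []
    for path in sorted(data_dir.iterdir()):
        # Skip symlinks (already archived), directories, and other file types.
        if path.is_symlink() or not path.is_file() or path.suffix != suffix:
            continue
        if path.stat().st_mtime >= cutoff:
            continue  # still recent, keep it on the local disk
        target = archive_dir / path.name
        if target.exists():
            continue  # never overwrite a copy that is already archived
        shutil.move(str(path), str(target))
        # The original path now points at the archived copy.
        path.symlink_to(target.resolve())
        moved.append(path.name)
    return moved
```

Skipping existing symlinks makes the job safe to run repeatedly, and refusing to overwrite an archived file avoids silently losing data if two recordings ever share a name.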
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can force a specific language code either in your underlying configuration (usually found in ~/.screenpipe/pipe.json) or by passing a flag when starting the daemon. You would use the standard ISO language codes:
(Slovak)
(Bulgarian)
(English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
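For reference, the ISO 639-1 codes for the three languages discussed here are sk (Slovak), bg (Bulgarian), and en (English). The name of the setting or flag that accepts them is not confirmed in this conversation, so the sketch below covers only the lookup and validation step; language_code is a hypothetical helper.

```python
# ISO 639-1 two-letter codes for the languages mentioned in this conversation.
LANGUAGE_CODES = {
    "slovak": "sk",
    "bulgarian": "bg",
    "english": "en",
}


def language_code(name: str) -> str:
    """Map a language name to its ISO 639-1 code, ignoring case and whitespace."""
    key = name.strip().lower()
    if key not in LANGUAGE_CODES:
        raise ValueError(f"unsupported language in this sketch: {name!r}")
    return LANGUAGE_CODES[key]
```

Failing loudly on an unknown name is a deliberate choice: a mistyped language is easier to fix at configuration time than after hours of audio have been transcribed with the wrong setting.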
|
32044
|
NULL
|
NULL
|
NULL
|
|
38981
|
1441
|
23
|
2026-05-14T06:32:19.443344+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740339443_m1.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXButton","text":"Mute tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: 
archive.db","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: db.sqlite","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize 
sidebar","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New 
chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your 
hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. 
(Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". 
This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. 
This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes I will do that. 
Is there a way to setup languages to transcription?","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As for setting up languages for transcription,","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes, you can configure it","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", but it helps to understand how ScreenPipe handles language natively first.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Transcription Engine Handles Language","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Transcription Engine Handles Language","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, 
ScreenPipe uses a","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"multilingual Whisper model","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and sets the language configuration to","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-Detect","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Advantage:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Drawback:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. 
When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How to Force a Specific Language","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How to Force a Specific Language","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are using the Desktop App UI:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Open the ScreenPipe settings.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Navigate to the","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AI/Transcription","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"section.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Look for 
the","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper Language","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Transcription Language","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"dropdown.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Change it from \"Auto\" to your specific language (e.g., English, Bulgarian, or Slovak).","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are running ScreenPipe via CLI/Config:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You can modify your underlying configuration (usually found in","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/pipe.json","depth":28,"bounds":{"left":0.0,"top":0.0,"width":0.134375,"height":0.020555556},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or by passing a flag when starting the daemon) to force a specific language code. 
You would use the standard ISO language codes:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Slovak)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Bulgarian)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(English)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more 
options","depth":22,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false}]...
|
4192274737029184738
|
9209072041784404957
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Mute tab
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How does it get transferred and transcribed? Is there a work-in-progress stage? How can I tell from the folder structure what has been done and what hasn't?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization," a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
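As a rough illustration of the full-text search idea described above, here is a minimal Python sketch using SQLite's FTS5 module. The table name `transcripts_fts` and its columns are hypothetical and are not taken from ScreenPipe's actual schema; the sketch also assumes the local SQLite build has FTS5 enabled.

```python
import sqlite3

# Hypothetical table and column names, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE transcripts_fts USING fts5(transcription, device UNINDEXED)"
)
conn.executemany(
    "INSERT INTO transcripts_fts (transcription, device) VALUES (?, ?)",
    [
        ("let us move the release to Friday", "System Audio (output)"),
        ("remember to renew the domain", "MacBook Pro Microphone (input)"),
    ],
)


def search(conn, phrase):
    """Return (device, transcription) rows whose text matches the FTS5 query."""
    return conn.execute(
        "SELECT device, transcription FROM transcripts_fts "
        "WHERE transcripts_fts MATCH ? ORDER BY rank",
        (phrase,),
    ).fetchall()


results = search(conn, "release")
```

Marking `device` as `UNINDEXED` keeps it stored alongside the text without making device names searchable, so a query only matches spoken words.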
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see files in ~/.screenpipe/data/ such as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4, MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4, soundcore AeroClip (input)_2026-05-12_07-40-48.mp4, and System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
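To see which transcripts would lose playback after such a deletion, one option is to compare the file references in the database against the files actually on disk. The sketch below is illustrative only: the table `audio_chunks` and its `file_path` column are assumptions, so inspect the real schema (for example with `.schema` in the sqlite3 shell) and adjust the query before using it.

```python
import sqlite3
from pathlib import Path


def find_missing_media(conn, data_dir):
    """Return referenced media file names that no longer exist under data_dir.

    Assumes a hypothetical table audio_chunks(file_path TEXT); the real
    table and column names may differ.
    """
    rows = conn.execute("SELECT DISTINCT file_path FROM audio_chunks").fetchall()
    missing = []
    for (file_path,) in rows:
        name = Path(file_path).name
        # Only the file name is compared, so absolute and relative
        # references are treated the same way.
        if not (Path(data_dir) / name).exists():
            missing.append(name)
    return sorted(missing)
```

Calling `find_missing_media(sqlite3.connect(db_path), data_dir)` with your own paths returns a sorted list of file names. Because `Path.exists()` follows symlinks, files that were moved elsewhere and replaced by valid symlinks are not reported as missing.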
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
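The archiving idea above could be sketched roughly as follows. This is a minimal example, not a tested backup tool: the function name, the 30-day default, and the flat `*.mp4` layout are assumptions for illustration, and it relies on symlink support as found on macOS and Linux.

```python
import os
import shutil
import time
from pathlib import Path


def archive_old_media(data_dir, archive_dir, max_age_days=30, now=None):
    """Move .mp4 files older than max_age_days into archive_dir and leave
    a symlink at the original location. Returns the names that were moved."""
    data_dir, archive_dir = Path(data_dir), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    cutoff = (now if now is not None else time.time()) - max_age_days * 86400
    moved = []
    for src in sorted(data_dir.glob("*.mp4")):
        # Skip files that were already archived (symlinks) and recent files.
        if src.is_symlink() or src.stat().st_mtime >= cutoff:
            continue
        dest = archive_dir / src.name
        if dest.exists():
            continue  # never overwrite an existing archive copy
        shutil.move(str(src), str(dest))
        os.symlink(dest.resolve(), src)
        moved.append(src.name)
    return moved
```

A cron entry could call this once a day with the NAS mount point as `archive_dir`. If the NAS is not mounted, the symlinks will dangle and playback of those recordings will fail until it is available again, so it is worth checking the mount before each run.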
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
(Slovak)
(Bulgarian)
(English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
Show more options...
|
38979
|
NULL
|
NULL
|
NULL
|
|
39146
|
1443
|
10
|
2026-05-14T06:37:33.953332+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740653953_m1.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Screenpipe [archive.db · 3234.2MB]
Screenpipe
[archive.db · 3234.2MB]
Activity
Search
Audio
Work Report
Timetable
AI Summary
Date
12
/
05
/
2026
Calendar
TOTAL SPAN
12h 25m
09:21 → 21:46
ACTIVE TIME
(WALL CLOCK)
9h 20m
BREAKS
2 · 3h 5m
SESSIONS — CLICK TO FILTER
S1: 09:21–16:08 (6h 47m)
S2: 16:33–17:41 (1h 8m)
S3: 20:21–21:46 (1h 25m)
Click a session segment to filter activity to that time window
S1 6h 47m
S2 1h 8m
S3 1h 25m
160m
09:21
11:25
13:29
15:33
17:37
19:42
21:46
FRAMES
7274
APPS
16
UI EVENTS
7044
AUDIO
319
ACTIVE PERIOD
(TIMES IN LOCAL TIMEZONE)
09:21 → 21:46
TIME PER APP
— CLICK TO FILTER ALL PANELS BY APP
Firefox
4h 36m
Slack
1h 55m
iTerm2
1h 17m
PhpStorm
51m
Claude
31m
Windsurf
19m
Code...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: 
db.sqlite","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: db.sqlite","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle 
button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. 
This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes I will do that. 
Is there a way to setup languages to transcription?","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As for setting up languages for transcription,","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes, you can configure it","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", but it helps to understand how ScreenPipe handles language natively first.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Transcription Engine Handles Language","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Transcription Engine Handles Language","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, 
ScreenPipe uses a","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"multilingual Whisper model","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and sets the language configuration to","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-Detect","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Advantage:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Drawback:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. 
When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How to Force a Specific Language","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How to Force a Specific Language","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are using the Desktop App UI:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Open the ScreenPipe settings.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Navigate to the","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AI/Transcription","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"section.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Look for 
the","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper Language","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Transcription Language","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"dropdown.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Change it from \"Auto\" to your specific language (e.g., English, Bulgarian, or Slovak).","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are running ScreenPipe via CLI/Config:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You can modify your underlying configuration (usually found in","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/pipe.json","depth":28,"bounds":{"left":0.0,"top":0.0,"width":0.134375,"height":0.020555556},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or by passing a flag when starting the daemon) to force a specific language code. 
You would use the standard ISO language codes:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Slovak)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Bulgarian)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(English)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more 
options","depth":22,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXTextArea","text":"Ask Gemini","depth":20,"on_screen":true,"value":"Ask Gemini","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Ask Gemini","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file menu","depth":20,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Tools","depth":18,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open mode picker","depth":20,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Pro","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Microphone","depth":19,"on_screen":true,"role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Send message","depth":19,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":false,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini is AI and can make mistakes, including about people.","depth":17,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Your privacy and Gemini Opens in a new 
window","depth":17,"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Your privacy and Gemini","depth":18,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Opens in a new window","depth":19,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Summarize page","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Summarize page","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Screenpipe [archive.db · 3234.2MB]","depth":7,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Screenpipe","depth":8,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"[archive.db · 3234.2MB]","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Activity","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":true,"is_selected":false},{"role":"AXButton","text":"Search","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Audio","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Work 
Report","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Timetable","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Summary","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Date","depth":8,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"12","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"/","depth":8,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"05","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"/","depth":8,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2026","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Calendar","depth":8,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"TOTAL SPAN","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"12h 25m","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"09:21 → 21:46","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ACTIVE 
TIME","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(WALL CLOCK)","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"9h 20m","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"BREAKS","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2 · 3h 5m","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SESSIONS — CLICK TO FILTER","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"S1: 09:21–16:08 (6h 47m)","depth":12,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"S2: 16:33–17:41 (1h 8m)","depth":12,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"S3: 20:21–21:46 (1h 25m)","depth":12,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Click a session segment to filter activity to that time window","depth":10,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"S1 6h 47m","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"S2 1h 8m","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"S3 1h 
25m","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"160m","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"09:21","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"11:25","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"13:29","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"15:33","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"17:37","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"19:42","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"21:46","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"FRAMES","depth":10,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"7274","depth":10,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"APPS","depth":10,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"16","depth":10,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"UI 
EVENTS","depth":10,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"7044","depth":10,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AUDIO","depth":10,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"319","depth":10,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ACTIVE PERIOD","depth":10,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(TIMES IN LOCAL TIMEZONE)","depth":10,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"09:21 → 21:46","depth":10,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"TIME PER APP","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"— CLICK TO FILTER ALL PANELS BY APP","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Firefox","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4h 36m","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Slack","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1h 55m","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"iTerm2","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1h 
17m","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"PhpStorm","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"51m","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Claude","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"31m","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Windsurf","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"19m","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Code","depth":11,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
1220148601518529645
|
9209072041777048541
|
visual_change
|
accessibility
|
NULL
|
You said
Explain to me: how does ScreenPipe audio recording work? Where does it start? How does the audio get transferred and transcribed? Is there a work-in-progress stage? How can I tell from the folder structure what has been processed and what hasn't?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
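To make the FTS5 point concrete, here is a minimal sketch of a full-text phrase search using Python's built-in sqlite3 module. The `transcripts` table and its columns are hypothetical and chosen only for illustration; ScreenPipe's actual schema may look different, and the sketch assumes the local SQLite build includes FTS5.

```python
import sqlite3

# Hypothetical, simplified schema for illustration only;
# ScreenPipe's real tables and columns may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE transcripts "
    "USING fts5(spoken_at UNINDEXED, speaker UNINDEXED, text)"
)
conn.executemany(
    "INSERT INTO transcripts VALUES (?, ?, ?)",
    [
        ("2026-04-21T10:02:11", "speaker_1", "Let's move the RAID rebuild to Friday."),
        ("2026-05-12T06:49:17", "speaker_2", "The transcription backlog cleared overnight."),
    ],
)


def search(phrase):
    """Return (timestamp, text) rows whose text contains the phrase."""
    # Wrapping the input in double quotes makes FTS5 treat it as one phrase.
    query = '"' + phrase.replace('"', '""') + '"'
    return conn.execute(
        "SELECT spoken_at, text FROM transcripts "
        "WHERE transcripts MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()


hits = search("raid rebuild")
```

The default FTS5 tokenizer is case-insensitive, which is why a lowercase query still finds "RAID rebuild".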
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
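The three stages above can be sketched as a simple producer/consumer queue. This is an illustration of the general pattern, not ScreenPipe's actual implementation: `fake_transcribe` is a hypothetical stand-in for the speech-to-text call, and the `committed` list stands in for the database insert.

```python
import queue
import threading

# Hypothetical stand-ins: fake_transcribe replaces the real speech-to-text
# call, and the committed list replaces the SQLite insert.
chunk_queue = queue.Queue()
committed = []


def fake_transcribe(chunk_name):
    return f"transcript of {chunk_name}"


def worker():
    # Processing queue: chunks wait here until the transcriber is free.
    while True:
        chunk_name = chunk_queue.get()
        if chunk_name is None:  # sentinel: no more audio is coming
            chunk_queue.task_done()
            break
        # Finalization: only now does the chunk count as "done".
        committed.append((chunk_name, fake_transcribe(chunk_name)))
        chunk_queue.task_done()


thread = threading.Thread(target=worker)
thread.start()

# Buffering: the capture side hands over chunks as they fill up.
for name in ["chunk_001", "chunk_002", "chunk_003"]:
    chunk_queue.put(name)
chunk_queue.put(None)

chunk_queue.join()
thread.join()
```

Anything still sitting in `chunk_queue` corresponds to the work-in-progress backlog; anything in `committed` is finished.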
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (…): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The … or … folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary … chunks, or locked database journals (like …), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
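Because these references are "soft", nothing stops the database from pointing at a file that is no longer on disk. A minimal sketch of a consistency check follows. The `audio_chunks` table and its `file_path` column are assumed names used for illustration, so check the real schema before adapting this; the demo runs against a throwaway in-memory database and a temporary directory rather than a real ScreenPipe install.

```python
import sqlite3
import tempfile
from pathlib import Path


def find_missing_media(conn, data_dir):
    """Return referenced file names that no longer exist under data_dir."""
    # audio_chunks.file_path is an assumed table/column name.
    rows = conn.execute("SELECT file_path FROM audio_chunks ORDER BY id")
    return [name for (name,) in rows if not (Path(data_dir) / name).exists()]


# Demo with throwaway data instead of a real ScreenPipe install.
demo_dir = tempfile.mkdtemp()
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audio_chunks (id INTEGER PRIMARY KEY, file_path TEXT)")
conn.executemany(
    "INSERT INTO audio_chunks (file_path) VALUES (?)",
    [
        ("System Audio (output)_2026-05-11_06-17-14.mp4",),
        ("soundcore AeroClip (input)_2026-05-12_07-40-48.mp4",),
    ],
)
# Only the first file actually exists on disk.
(Path(demo_dir) / "System Audio (output)_2026-05-11_06-17-14.mp4").touch()

missing = find_missing_media(conn, demo_dir)
```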
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
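The archiving idea could be sketched roughly as follows. This is a hedged sketch rather than a tested ScreenPipe workflow: the function name, the 30-day default, and the use of file modification time as the age are all assumptions, and it relies on the premise above that playback follows symlinks. Try it on a copy of the data first.

```python
import os
import shutil
import time
from pathlib import Path


def archive_old_media(data_dir, archive_dir, max_age_days=30, now=None):
    """Move .mp4 files older than max_age_days and leave symlinks behind."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    # Resolve to an absolute path so the symlinks stay valid from anywhere.
    archive = Path(archive_dir).resolve()
    archive.mkdir(parents=True, exist_ok=True)
    moved = []
    for media in sorted(Path(data_dir).glob("*.mp4")):
        if media.is_symlink() or media.stat().st_mtime >= cutoff:
            continue  # already archived, or still recent
        target = archive / media.name
        shutil.move(str(media), str(target))  # also works across filesystems
        os.symlink(target, media)  # the old path keeps resolving
        moved.append(media.name)
    return moved
```

A cron entry could then call this once a day with the data folder and the NAS mount point as arguments. Running it twice is harmless, since files that are already symlinks are skipped.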
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json), or pass a flag when starting the daemon, to force a specific language code. You would use the standard ISO 639-1 language codes:
sk (Slovak)
bg (Bulgarian)
en (English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
Screenpipe [archive.db · 3234.2MB]
Screenpipe
[archive.db · 3234.2MB]
Activity
Search
Audio
Work Report
Timetable
AI Summary
Date
12
/
05
/
2026
Calendar
TOTAL SPAN
12h 25m
09:21 → 21:46
ACTIVE TIME
(WALL CLOCK)
9h 20m
BREAKS
2 · 3h 5m
SESSIONS — CLICK TO FILTER
S1: 09:21–16:08 (6h 47m)
S2: 16:33–17:41 (1h 8m)
S3: 20:21–21:46 (1h 25m)
Click a session segment to filter activity to that time window
S1 6h 47m
S2 1h 8m
S3 1h 25m
160m
09:21
11:25
13:29
15:33
17:37
19:42
21:46
FRAMES
7274
APPS
16
UI EVENTS
7044
AUDIO
319
ACTIVE PERIOD
(TIMES IN LOCAL TIMEZONE)
09:21 → 21:46
TIME PER APP
— CLICK TO FILTER ALL PANELS BY APP
Firefox
4h 36m
Slack
1h 55m
iTerm2
1h 17m
PhpStorm
51m
Claude
31m
Windsurf
19m
Code...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
39284
|
1443
|
71
|
2026-05-14T06:41:38.341884+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740898341_m1.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
You said
Explain to me: how does ScreenPipe audio recording work? Where does it start? How is it transferred and transcribed? Is there a work-in-progress stage? How can I see from the folder structure what has been done and what hasn't?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", the technical term for splitting a recording by speaker. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
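To make the full-text search step concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `transcripts_fts`, its columns, and the sample rows are made up for illustration and are not ScreenPipe's actual schema; it also assumes the local SQLite build includes FTS5, which is common but not guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical FTS5 table: the names are illustrative, not ScreenPipe's schema.
conn.execute(
    "CREATE VIRTUAL TABLE transcripts_fts "
    "USING fts5(transcription, file_path UNINDEXED)"
)

# Invented sample transcripts, paired with filenames quoted in this conversation.
conn.executemany(
    "INSERT INTO transcripts_fts (transcription, file_path) VALUES (?, ?)",
    [
        ("we agreed to ship the billing fix on Friday",
         "System Audio (output)_2026-05-11_06-17-14.mp4"),
        ("remember to back up the NAS tonight",
         "MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4"),
    ],
)


def search(phrase):
    """Return (file_path, transcription) rows containing the exact phrase."""
    # Wrap the phrase in double quotes so FTS5 treats it as one phrase query.
    query = '"' + phrase.replace('"', '""') + '"'
    cursor = conn.execute(
        "SELECT file_path, transcription FROM transcripts_fts "
        "WHERE transcripts_fts MATCH ? ORDER BY rank",
        (query,),
    )
    return cursor.fetchall()


hits = search("billing fix")
```

Marking `file_path` as UNINDEXED keeps it retrievable without making it searchable, which fits the idea of text that points back at a media file.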
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
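The buffering, queueing, and finalization steps above can be sketched as a small producer/consumer loop. This is only an illustration of the general pattern, not ScreenPipe's implementation: `fake_transcribe` is a stand-in for a real speech-to-text call, and the `done` list stands in for the database.

```python
import queue
import threading


def fake_transcribe(chunk: bytes) -> str:
    # Stand-in for a real speech-to-text call (an assumption for this sketch).
    return f"<{len(chunk)} bytes transcribed>"


def worker(pending: queue.Queue, done: list) -> None:
    """Drain the queue in order; a None item signals shutdown."""
    while True:
        item = pending.get()
        if item is None:
            break
        chunk_id, chunk = item
        # "Finalization": the result is stored only after transcription ends.
        done.append((chunk_id, fake_transcribe(chunk)))


pending: queue.Queue = queue.Queue()  # the WIP stage: chunks waiting their turn
done: list = []                       # stands in for rows committed to SQLite

thread = threading.Thread(target=worker, args=(pending, done))
thread.start()

# "Buffering": the capture side enqueues chunks as they are recorded.
for chunk_id in range(3):
    pending.put((chunk_id, b"\x00" * 1024 * (chunk_id + 1)))

pending.put(None)  # no more audio
thread.join()
```

While the worker is busy, anything still sitting in `pending` is the backlog; once the queue is empty, everything has been finalized.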
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (db.sqlite): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
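The filenames quoted in the question appear to follow a `<device> (<input|output>)_<date>_<time>.mp4` pattern. Assuming that pattern holds, which is inferred only from those four examples, a small parser can recover the device, the direction, and the start time. The names `AudioFile` and `parse_audio_filename` are hypothetical helpers, not part of ScreenPipe.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class AudioFile(NamedTuple):
    device: str
    direction: str        # "input" (microphone) or "output" (system audio)
    started_at: datetime  # naive timestamp; the timezone is not encoded


# Pattern inferred from the example filenames only (an assumption).
PATTERN = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)"
    r"_(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)


def parse_audio_filename(name: str) -> Optional[AudioFile]:
    """Split a recording filename into its parts, or return None if it differs."""
    match = PATTERN.match(name)
    if match is None:
        return None
    started_at = datetime.strptime(match["stamp"], "%Y-%m-%d_%H-%M-%S")
    return AudioFile(match["device"], match["direction"], started_at)


parsed = parse_audio_filename(
    "LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4"
)
```

Grouping parsed files by `device` or by date gives a quick overview of what was recorded without opening the database.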
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
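Because the database only stores references to the media files, it can be useful to check which references no longer resolve. The sketch below shows one way to do that. The table name `audio_transcriptions` comes from the answer above, but the `file_path` column and the demo rows are assumptions, so the query would need adjusting to the real schema.

```python
import sqlite3
import tempfile
from pathlib import Path


def missing_media(conn: sqlite3.Connection, data_dir: Path) -> list:
    """Return referenced filenames that no longer exist in data_dir."""
    # `file_path` is a hypothetical column name, not a confirmed one.
    rows = conn.execute("SELECT DISTINCT file_path FROM audio_transcriptions")
    return sorted(name for (name,) in rows if not (data_dir / name).exists())


# Demo against a throwaway directory and an in-memory database.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    (data_dir / "kept.mp4").touch()

    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE audio_transcriptions (transcription TEXT, file_path TEXT)"
    )
    conn.executemany(
        "INSERT INTO audio_transcriptions VALUES (?, ?)",
        [("hello", "kept.mp4"), ("world", "deleted.mp4")],
    )
    missing = missing_media(conn, data_dir)
    conn.close()
```

Any filename in the result still has searchable text in the database but can no longer be played back.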
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
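The archiving idea could be sketched roughly as follows: move .mp4 files older than a cutoff into an archive directory and leave a symlink behind under the original name. This is a minimal sketch under several assumptions: `archive_old_media` is a hypothetical helper, the archive path stands in for a mounted NAS share, age is judged by modification time, and whether ScreenPipe follows symlinks during playback is not verified here.

```python
import os
import shutil
import tempfile
import time
from pathlib import Path


def archive_old_media(data_dir, archive_dir, max_age_days, now=None):
    """Move old .mp4 files to archive_dir and symlink them back in place."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(data_dir.glob("*.mp4")):
        # Skip files archived earlier (already symlinks) and recent files.
        if path.is_symlink() or path.stat().st_mtime >= cutoff:
            continue
        target = archive_dir / path.name
        shutil.move(str(path), str(target))  # also works across filesystems
        path.symlink_to(target)              # keep the original name resolvable
        moved.append(path.name)
    return moved


# Demo in throwaway directories; a real run would point at the NAS mount.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "data"
    archive_dir = Path(tmp) / "nas"
    data_dir.mkdir()
    old_file = data_dir / "old.mp4"
    new_file = data_dir / "new.mp4"
    old_file.write_text("old audio")
    new_file.write_text("new audio")
    forty_days_ago = time.time() - 40 * 86400
    os.utime(old_file, (forty_days_ago, forty_days_ago))

    moved = archive_old_media(data_dir, archive_dir, max_age_days=30)
    old_is_link = old_file.is_symlink()
    old_content = old_file.read_text()
    new_is_link = new_file.is_symlink()
```

If the NAS is offline, the symlinks dangle and playback of archived audio fails until the share is mounted again, so testing on a copy of the folder first is a reasonable precaution.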
You said
yes I will do that. Is there a way to setup languages to transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json), or pass a flag when starting the daemon, to force a specific language code. You would use the standard ISO 639-1 language codes:
sk (Slovak)
bg (Bulgarian)
en (English)
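The three languages mentioned here correspond to the ISO 639-1 codes sk, bg, and en. As a small illustration of the "auto versus forced" choice, the sketch below normalizes a user-facing setting into either a code or None for auto-detect. `resolve_language` is a hypothetical helper; how ScreenPipe itself reads this setting is not shown in this conversation.

```python
from typing import Optional

# ISO 639-1 codes for the three languages discussed above.
LANGUAGE_CODES = {"slovak": "sk", "bulgarian": "bg", "english": "en"}


def resolve_language(setting: Optional[str]) -> Optional[str]:
    """Return an ISO 639-1 code, or None to mean "let the model auto-detect"."""
    if setting is None:
        return None
    key = setting.strip().lower()
    if key in ("", "auto"):
        return None
    if key in LANGUAGE_CODES.values():
        return key                  # already a code such as "sk"
    if key in LANGUAGE_CODES:
        return LANGUAGE_CODES[key]  # a language name such as "Slovak"
    raise ValueError(f"unsupported language setting: {setting!r}")


forced = resolve_language("Slovak")
automatic = resolve_language("Auto")
```

Rejecting unknown values early avoids silently falling back to auto-detect when a code is mistyped.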
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
Screenpipe [archive.db · 3234.2MB]
Screenpipe
[archive.db · 3234.2MB]
Activity
Search
Audio
Work Report
Timetable
AI Summary
Date
12
/
05
/
2026
Calendar
TOTAL SPAN
12h 25m
09:21 → 21:46
ACTIVE TIME
(WALL CLOCK)
9h 20m
BREAKS
2 · 3h 5m
SESSIONS — CLICK TO FILTER
S1: 09:21–16:08 (6h 47m)
S2: 16:33–17:41 (1h 8m)
S3: 20:21–21:46 (1h 25m)
Click a session segment to filter activity to that time window
S1 6h 47m
S2 1h 8m
S3 1h 25m
160m
09:21
11:25
13:29
15:33
17:37
19:42
21:46
FRAMES
7274
APPS
16
UI EVENTS
7044
AUDIO
319
ACTIVE PERIOD
(TIMES IN LOCAL TIMEZONE)
09:21 → 21:46
TIME PER APP
— CLICK TO FILTER ALL PANELS BY APP
Firefox
4h 36m
Slack
1h 55m
iTerm2
1h 17m
PhpStorm
51m
Claude
31m
Windsurf
19m
Code
10m
QuickTime Player
9m
Finder...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.5,"top":0.072222225,"width":0.097222224,"height":0.045555554},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.5277778,"top":0.08777778,"width":0.079166666,"height":0.015},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.5,"top":0.11777778,"width":0.097222224,"height":0.045555554},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.5277778,"top":0.13333334,"width":0.061805554,"height":0.015},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.5,"top":0.16333333,"width":0.097222224,"height":0.045555554},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.5277778,"top":0.17888889,"width":0.077083334,"height":0.015},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.5,"top":0.20888889,"width":0.097222224,"height":0.045555554},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — 
Archive","depth":5,"bounds":{"left":0.5277778,"top":0.22444445,"width":0.079166666,"height":0.015},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.5715278,"top":0.2188889,"width":0.016666668,"height":0.026666667},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.5,"top":0.25444445,"width":0.097222224,"height":0.045555554},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.5277778,"top":0.27,"width":0.08506945,"height":0.015},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.5,"top":0.3,"width":0.097222224,"height":0.045555554},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: db.sqlite","depth":5,"bounds":{"left":0.5277778,"top":0.31555554,"width":0.07847222,"height":0.015},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.5,"top":0.34555554,"width":0.097222224,"height":0.045555554},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.5277778,"top":0.3611111,"width":0.025347222,"height":0.015},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid 
Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.5,"top":0.3911111,"width":0.097222224,"height":0.045555554},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.5277778,"top":0.40666667,"width":0.22986111,"height":0.015},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.5,"top":0.43666667,"width":0.097222224,"height":0.045555554},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.5277778,"top":0.45222223,"width":0.11840278,"height":0.015},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New Tab","depth":4,"bounds":{"left":0.50590277,"top":0.48444444,"width":0.08576389,"height":0.035555556},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.50590277,"top":0.9583333,"width":0.022222223,"height":0.035555556},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.52881944,"top":0.9583333,"width":0.022222223,"height":0.035555556},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history 
(⇧⌘H)","depth":6,"bounds":{"left":0.5520833,"top":0.9583333,"width":0.022222223,"height":0.035555556},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.50590277,"top":0.9211111,"width":0.022222223,"height":0.035555556},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.52881944,"top":0.9211111,"width":0.022222223,"height":0.035555556},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"bounds":{"left":0.81666666,"top":0.07666667,"width":0.022222223,"height":0.035555556},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.84166664,"top":0.07666667,"width":0.022222223,"height":0.035555556},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.8361111,"top":0.14444445,"width":0.027777778,"height":0.044444446},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main 
menu","depth":12,"bounds":{"left":0.60555553,"top":0.14444445,"width":0.027777778,"height":0.044444446},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.75,"top":0.14444445,"width":0.027777778,"height":0.044444446},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"bounds":{"left":0.7777778,"top":0.14444445,"width":0.027777778,"height":0.044444446},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.8055556,"top":0.14444445,"width":0.027777778,"height":0.044444446},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.59652776,"top":0.20555556,"width":0.00069444446,"height":0.0011111111},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.59652776,"top":0.20888889,"width":0.25069445,"height":0.035555556},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your 
hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. 
(Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". 
This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. 
This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"bounds":{"left":0.6111111,"top":0.041666668,"width":0.022222223,"height":0.035555556},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"bounds":{"left":0.6333333,"top":0.041666668,"width":0.022222223,"height":0.035555556},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"bounds":{"left":0.65555555,"top":0.041666668,"width":0.022222223,"height":0.035555556},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"bounds":{"left":0.67777777,"top":0.041666668,"width":0.022222223,"height":0.035555556},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"bounds":{"left":0.7,"top":0.041666668,"width":0.022222223,"height":0.035555556},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"bounds":{"left":0.64444447,"top":0.1261111,"width":0.027777778,"height":0.044444446},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"bounds":{"left":0.675,"top":0.1261111,"width":0.027777778,"height":0.044444446},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"bounds":{"left":0.71666664,"top":0.13944444,"width":0.13611111,"height":0.08},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"bounds":{"left":0.59652776,"top":0.14055556,"width":0.041666668,"height":0.022777777},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes I will do that. 
Is there a way to setup languages to transcription?","depth":23,"bounds":{"left":0.71666664,"top":0.14166667,"width":0.13333334,"height":0.07611111},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"bounds":{"left":1.0,"top":0.2638889,"width":-0.085416675,"height":0.044444446},"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"bounds":{"left":0.6458333,"top":0.26722223,"width":0.06458333,"height":0.020555556},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"bounds":{"left":0.6409722,"top":0.325,"width":0.00069444446,"height":0.0011111111},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"bounds":{"left":0.6409722,"top":0.32777777,"width":0.08576389,"height":0.026666667},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"That is a solid plan. 
Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.","depth":27,"bounds":{"left":0.6180556,"top":0.32944444,"width":0.23680556,"height":0.13833334},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As for setting up languages for transcription,","depth":27,"bounds":{"left":0.6180556,"top":0.4861111,"width":0.228125,"height":0.022777777},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes, you can configure it","depth":27,"bounds":{"left":0.6180556,"top":0.515,"width":0.13263889,"height":0.022777777},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", but it helps to understand how ScreenPipe handles language natively first.","depth":27,"bounds":{"left":0.6180556,"top":0.515,"width":0.23645833,"height":0.08055556},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Transcription Engine Handles Language","depth":26,"bounds":{"left":0.6180556,"top":0.6294444,"width":0.24027778,"height":0.053333335},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Transcription Engine Handles Language","depth":27,"bounds":{"left":0.6180556,"top":0.63166666,"width":0.20902778,"height":0.049444444},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses a","depth":27,"bounds":{"left":0.6180556,"top":0.695,"width":0.15243055,"height":0.022777777},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"multilingual Whisper 
model","depth":27,"bounds":{"left":0.6180556,"top":0.695,"width":0.21840277,"height":0.051666666},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and sets the language configuration to","depth":27,"bounds":{"left":0.6180556,"top":0.7238889,"width":0.19861111,"height":0.051666666},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-Detect","depth":27,"bounds":{"left":0.70104164,"top":0.75277776,"width":0.067708336,"height":0.022777777},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"bounds":{"left":0.76875,"top":0.75277776,"width":0.0027777778,"height":0.022777777},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.","depth":27,"bounds":{"left":0.6180556,"top":0.79388887,"width":0.23993056,"height":0.10944445},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Advantage:","depth":29,"bounds":{"left":0.64444447,"top":0.9216667,"width":0.08506945,"height":0.022777777},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.","depth":29,"bounds":{"left":0.64444447,"top":0.9216667,"width":0.21388888,"height":0.07833332},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The 
Drawback:","depth":29,"bounds":{"left":0.64444447,"top":1.0,"width":0.08125,"height":-0.07833338},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.","depth":29,"bounds":{"left":0.64444447,"top":1.0,"width":0.21354167,"height":-0.07833338},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How to Force a Specific Language","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How to Force a Specific Language","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. 
This also slightly speeds up the processing because the model skips the detection phase.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are using the Desktop App UI:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Open the ScreenPipe settings.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Navigate to the","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AI/Transcription","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"section.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Look for the","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper Language","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Transcription Language","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"dropdown.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Change it from \"Auto\" to your specific language (e.g., English, Bulgarian, or 
Slovak).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are running ScreenPipe via CLI/Config:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You can modify your underlying configuration (usually found in","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/pipe.json","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Slovak)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Bulgarian)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(English)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle 
button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":22,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXTextArea","text":"Ask Gemini","depth":20,"bounds":{"left":0.625,"top":0.75166667,"width":0.22222222,"height":0.026666667},"on_screen":true,"value":"Ask Gemini","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Ask Gemini","depth":21,"bounds":{"left":0.625,"top":0.75222224,"width":0.06284722,"height":0.025555555},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file menu","depth":20,"bounds":{"left":0.6166667,"top":0.8016667,"width":0.027777778,"height":0.044444446},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Tools","depth":18,"bounds":{"left":0.65,"top":0.8016667,"width":0.027777778,"height":0.044444446},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open mode 
picker","depth":20,"bounds":{"left":0.7690972,"top":0.79833335,"width":0.05451389,"height":0.044444446},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Pro","depth":23,"bounds":{"left":0.78020835,"top":0.8105556,"width":0.015625,"height":0.020555556},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Microphone","depth":19,"bounds":{"left":0.8277778,"top":0.79833335,"width":0.027777778,"height":0.044444446},"on_screen":true,"role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Send message","depth":19,"bounds":{"left":0.84097224,"top":0.7972222,"width":0.029166667,"height":0.046666667},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":false,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini is AI and can make mistakes, including about people.","depth":17,"bounds":{"left":0.62048614,"top":0.87333333,"width":0.23125,"height":0.017222222},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Your privacy and Gemini Opens in a new window","depth":17,"bounds":{"left":0.68993056,"top":0.89111114,"width":0.09236111,"height":0.017222222},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Your privacy and Gemini","depth":18,"bounds":{"left":0.68993056,"top":0.89111114,"width":0.09236111,"height":0.017222222},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Opens in a new 
window","depth":19,"bounds":{"left":0.59652776,"top":0.89,"width":0.090277776,"height":0.017222222},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Summarize page","depth":7,"bounds":{"left":0.60833335,"top":0.9405556,"width":0.11180556,"height":0.035555556},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Summarize page","depth":9,"bounds":{"left":0.6201389,"top":0.94777775,"width":0.088194445,"height":0.02111111},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Screenpipe [archive.db · 3234.2MB]","depth":7,"bounds":{"left":0.8892361,"top":0.08111111,"width":0.10277778,"height":0.046666667},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Screenpipe","depth":8,"bounds":{"left":0.8892361,"top":0.083333336,"width":0.05451389,"height":0.018888889},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"[archive.db · 
3234.2MB]","depth":9,"bounds":{"left":0.8892361,"top":0.08777778,"width":0.09652778,"height":0.03722222},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Activity","depth":7,"bounds":{"left":0.8892361,"top":0.19888888,"width":0.050347224,"height":0.02388889},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Search","depth":7,"bounds":{"left":0.94166666,"top":0.19888888,"width":0.050347224,"height":0.02388889},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Audio","depth":7,"bounds":{"left":0.8892361,"top":0.22611111,"width":0.050347224,"height":0.03888889},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Work Report","depth":7,"bounds":{"left":0.94166666,"top":0.22611111,"width":0.050347224,"height":0.03888889},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Timetable","depth":7,"bounds":{"left":0.8892361,"top":0.26833335,"width":0.050347224,"height":0.03888889},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI 
Summary","depth":7,"bounds":{"left":0.94166666,"top":0.26833335,"width":0.050347224,"height":0.03888889},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Date","depth":8,"bounds":{"left":0.8892361,"top":0.13833334,"width":0.017013889,"height":0.015},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"12","depth":9,"bounds":{"left":0.8975694,"top":0.16944444,"width":0.009027778,"height":0.015},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"/","depth":8,"bounds":{"left":0.90868056,"top":0.16944444,"width":0.0048611113,"height":0.015},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"05","depth":9,"bounds":{"left":0.915625,"top":0.16944444,"width":0.009027778,"height":0.015},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"/","depth":8,"bounds":{"left":0.9267361,"top":0.16944444,"width":0.004513889,"height":0.015},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2026","depth":9,"bounds":{"left":0.93333334,"top":0.16944444,"width":0.018402778,"height":0.015},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Calendar","depth":8,"bounds":{"left":0.9545139,"top":0.17,"width":0.010069445,"height":0.013888889},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"TOTAL 
SPAN","depth":11,"bounds":{"left":0.8982639,"top":0.34666666,"width":0.049652778,"height":0.015},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"12h 25m","depth":11,"bounds":{"left":0.8982639,"top":0.36666667,"width":0.05138889,"height":0.02388889},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"09:21 → 21:46","depth":11,"bounds":{"left":0.8982639,"top":0.39555556,"width":0.06458333,"height":0.018888889},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ACTIVE TIME","depth":11,"bounds":{"left":0.8982639,"top":0.42944443,"width":0.053125,"height":0.015},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(WALL CLOCK)","depth":11,"bounds":{"left":0.8982639,"top":0.43166667,"width":0.07430556,"height":0.030555556},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"9h 20m","depth":11,"bounds":{"left":0.8982639,"top":0.4672222,"width":0.041666668,"height":0.02111111},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"BREAKS","depth":11,"bounds":{"left":0.9059028,"top":0.5038889,"width":0.03125,"height":0.015},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2 · 3h 5m","depth":11,"bounds":{"left":0.9059028,"top":0.5233333,"width":0.050347224,"height":0.02111111},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SESSIONS — CLICK TO FILTER","depth":11,"bounds":{"left":0.8982639,"top":0.56,"width":0.07604167,"height":0.033333335},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"S1: 09:21–16:08 (6h 
47m)","depth":12,"bounds":{"left":0.90520835,"top":0.6066667,"width":0.058333334,"height":0.033333335},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"S2: 16:33–17:41 (1h 8m)","depth":12,"bounds":{"left":0.90520835,"top":0.65444446,"width":0.057638887,"height":0.033333335},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"S3: 20:21–21:46 (1h 25m)","depth":12,"bounds":{"left":0.90520835,"top":0.7022222,"width":0.059375,"height":0.033333335},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Click a session segment to filter activity to that time window","depth":10,"bounds":{"left":0.8982639,"top":0.75555557,"width":0.08055556,"height":0.047222223},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"S1 6h 47m","depth":11,"bounds":{"left":0.90416664,"top":0.81722224,"width":0.035069443,"height":0.013888889},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"S2 1h 8m","depth":11,"bounds":{"left":0.9357639,"top":0.81722224,"width":0.030902777,"height":0.013888889},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"S3 1h 
25m","depth":11,"bounds":{"left":0.9600694,"top":0.81722224,"width":0.035069443,"height":0.013888889},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"160m","depth":11,"bounds":{"left":0.9548611,"top":0.81722224,"width":0.018055556,"height":0.013888889},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"09:21","depth":11,"bounds":{"left":0.8902778,"top":0.825,"width":0.017361112,"height":0.012222222},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"11:25","depth":11,"bounds":{"left":0.9048611,"top":0.825,"width":0.015972223,"height":0.012222222},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"13:29","depth":11,"bounds":{"left":0.91805553,"top":0.825,"width":0.017013889,"height":0.012222222},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"15:33","depth":11,"bounds":{"left":0.9322917,"top":0.825,"width":0.017013889,"height":0.012222222},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"17:37","depth":11,"bounds":{"left":0.9465278,"top":0.825,"width":0.016319444,"height":0.012222222},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"19:42","depth":11,"bounds":{"left":0.9597222,"top":0.825,"width":0.017361112,"height":0.012222222},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"21:46","depth":11,"bounds":{"left":0.9736111,"top":0.825,"width":0.017361112,"height":0.012222222},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"FRAMES","depth":10,"bounds":{"left":0.8954861,"top":0.8833333,"width":0.030208332,"height":0.
013888889},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"7274","depth":10,"bounds":{"left":0.8954861,"top":0.9033333,"width":0.02673611,"height":0.02111111},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"APPS","depth":10,"bounds":{"left":0.95034724,"top":0.8833333,"width":0.019791666,"height":0.013888889},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"16","depth":10,"bounds":{"left":0.95034724,"top":0.9033333,"width":0.012847222,"height":0.02111111},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"UI EVENTS","depth":10,"bounds":{"left":0.8954861,"top":0.95555556,"width":0.029513888,"height":0.030555556},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"7044","depth":10,"bounds":{"left":0.8954861,"top":0.99222225,"width":0.028472222,"height":0.0077777505},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AUDIO","depth":10,"bounds":{"left":0.95034724,"top":0.95555556,"width":0.023958333,"height":0.013888889},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"319","depth":10,"bounds":{"left":0.95034724,"top":0.97555554,"width":0.02013889,"height":0.02111111},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ACTIVE PERIOD","depth":10,"bounds":{"left":0.8954861,"top":1.0,"width":0.059375,"height":-0.04444444},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(TIMES IN LOCAL 
TIMEZONE)","depth":10,"bounds":{"left":0.8954861,"top":1.0,"width":0.08541667,"height":-0.04444444},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"09:21 → 21:46","depth":10,"bounds":{"left":0.8954861,"top":1.0,"width":0.06423611,"height":-0.08277774},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"TIME PER APP","depth":9,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"— CLICK TO FILTER ALL PANELS BY APP","depth":9,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Firefox","depth":11,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4h 36m","depth":11,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Slack","depth":11,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1h 55m","depth":11,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"iTerm2","depth":11,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1h 
17m","depth":11,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"PhpStorm","depth":11,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"51m","depth":11,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Claude","depth":11,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"31m","depth":11,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Windsurf","depth":11,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"19m","depth":11,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Code","depth":11,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"10m","depth":11,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"QuickTime Player","depth":11,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"9m","depth":11,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finder","depth":11,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-4506537694959731087
|
9209072041777048541
|
visual_change
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
Diarization:
As it transcribes the text, the engine also performs "diarization", which segments the audio by speaker (who spoke when).
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
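The full-text lookup described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not ScreenPipe's actual schema: the table name transcripts_fts, its columns, and the sample rows are all hypothetical, and it assumes the local SQLite build includes the FTS5 extension.

```python
import sqlite3


def build_demo_index(rows):
    """Create an in-memory FTS5 table and load (timestamp, text) rows into it."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table and column names, chosen for illustration only.
    # UNINDEXED keeps the timestamp stored but out of the full-text index.
    conn.execute(
        "CREATE VIRTUAL TABLE transcripts_fts "
        "USING fts5(timestamp UNINDEXED, transcription)"
    )
    conn.executemany(
        "INSERT INTO transcripts_fts (timestamp, transcription) VALUES (?, ?)",
        rows,
    )
    return conn


def search(conn, phrase):
    """Return (timestamp, transcription) rows containing the exact phrase."""
    # Double quotes turn the input into an FTS5 phrase query;
    # embedded quotes are escaped by doubling them.
    quoted = '"' + phrase.replace('"', '""') + '"'
    cursor = conn.execute(
        "SELECT timestamp, transcription FROM transcripts_fts "
        "WHERE transcripts_fts MATCH ? ORDER BY rank",
        (quoted,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    demo = build_demo_index([
        ("2026-05-12T09:30:00", "let us review the quarterly budget"),
        ("2026-05-12T10:00:00", "the deployment failed last night"),
    ])
    print(search(demo, "quarterly budget"))
```

Because the index is built at insert time, a phrase lookup does not need to scan every stored transcript, which is why searches over weeks of history can stay fast.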
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/
directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the
db.sqlite
database along with
precise timestamps and file path references
.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific
.mp4
file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw
.mp4
files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that
.mp4
file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your
db.sqlite
and query the tables (e.g.,
audio_transcriptions
), you would see that the database treats the
~/.screenpipe/data/
folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like
soundcore AeroClip (input)_2026-05-12_07-40-48.mp4
.
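The filenames quoted in this conversation appear to follow the pattern "device (input|output)_YYYY-MM-DD_HH-MM-SS.mp4". As a rough sketch, assuming that pattern holds for every file, the helper below splits a name into its parts. The function and type names are hypothetical, and the source does not say whether the timestamp is UTC or local time, so the result is left as a naive datetime.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class AudioFile(NamedTuple):
    device: str
    direction: str  # "input" or "output"
    started_at: datetime  # naive: the timezone is not stated in the filename


# Pattern inferred from the example filenames only; it is an assumption.
_PATTERN = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)_"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)


def parse_audio_filename(name: str) -> Optional[AudioFile]:
    """Split a recording filename into device, direction, and start time."""
    match = _PATTERN.match(name)
    if match is None:
        return None  # the name does not follow the assumed pattern
    started = datetime.strptime(match.group("stamp"), "%Y-%m-%d_%H-%M-%S")
    return AudioFile(match.group("device"), match.group("direction"), started)


if __name__ == "__main__":
    print(parse_audio_filename("soundcore AeroClip (input)_2026-05-12_07-40-48.mp4"))
```

A helper like this makes it possible to group recordings by device or day straight from a directory listing, without opening the database.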
What happens if you delete them?
If you manually
rm
those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection:
ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving:
If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly
older
files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
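The archiving idea above could be sketched as follows. This is an illustrative script, not a ScreenPipe feature: the function name, the 30-day threshold, and the example mount path are assumptions, and it relies on the claim above that playback keeps working through symlinks, which is worth testing on a single file first.

```python
import shutil
import time
from pathlib import Path


def archive_old_media(data_dir, archive_dir, max_age_days=30, now=None):
    """Move .mp4 files older than max_age_days and leave symlinks behind.

    Returns the names of the files that were moved.
    """
    data_dir, archive_dir = Path(data_dir), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    cutoff = (time.time() if now is None else now) - max_age_days * 86400
    moved = []
    for src in sorted(data_dir.glob("*.mp4")):
        # Skip files that were already archived in an earlier run.
        if src.is_symlink() or not src.is_file():
            continue
        if src.stat().st_mtime >= cutoff:
            continue  # still recent enough to stay on the local disk
        dest = archive_dir / src.name
        if dest.exists():
            continue  # never overwrite something already in the archive
        shutil.move(str(src), str(dest))
        # Recreate the original path as a link to the archived copy.
        src.symlink_to(dest.resolve())
        moved.append(src.name)
    return moved


if __name__ == "__main__":
    # Hypothetical NAS mount point; adjust to the actual mount.
    archived = archive_old_media(
        Path.home() / ".screenpipe" / "data",
        "/Volumes/nas/screenpipe-archive",
    )
    print(f"archived {len(archived)} files")
```

Running it from cron once a day would be enough. Note that if the NAS is unmounted, the symlinks dangle and playback of archived recordings fails until it is mounted again.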
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription,
yes, you can configure it
, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a
multilingual Whisper model
and sets the language configuration to
Auto-Detect
.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage:
If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback:
Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the
Audio
or
AI/Transcription
section.
Look for the
Whisper Language
or
Transcription Language
dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in
~/.screenpipe/pipe.json
or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
(Slovak)
(Bulgarian)
(English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
Screenpipe [archive.db · 3234.2MB]
Screenpipe
[archive.db · 3234.2MB]
Activity
Search
Audio
Work Report
Timetable
AI Summary
Date
12
/
05
/
2026
Calendar
TOTAL SPAN
12h 25m
09:21 → 21:46
ACTIVE TIME
(WALL CLOCK)
9h 20m
BREAKS
2 · 3h 5m
SESSIONS — CLICK TO FILTER
S1: 09:21–16:08 (6h 47m)
S2: 16:33–17:41 (1h 8m)
S3: 20:21–21:46 (1h 25m)
Click a session segment to filter activity to that time window
S1 6h 47m
S2 1h 8m
S3 1h 25m
160m
09:21
11:25
13:29
15:33
17:37
19:42
21:46
FRAMES
7274
APPS
16
UI EVENTS
7044
AUDIO
319
ACTIVE PERIOD
(TIMES IN LOCAL TIMEZONE)
09:21 → 21:46
TIME PER APP
— CLICK TO FILTER ALL PANELS BY APP
Firefox
4h 36m
Slack
1h 55m
iTerm2
1h 17m
PhpStorm
51m
Claude
31m
Windsurf
19m
Code
10m
QuickTime Player
9m
Finder...
|
39282
|
NULL
|
NULL
|
NULL
|
|
38735
|
1437
|
30
|
2026-05-14T06:27:40.030746+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740060030_m1.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Mute tab
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", which splits the audio by speaker. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
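The storage step above can be sketched with Python's built-in sqlite3 module. This is only an illustration of how an FTS5 index answers phrase searches; the table name transcripts_fts and the columns spoken_at and transcript are made up for the example and are not ScreenPipe's actual schema. It also assumes the local SQLite build ships with FTS5 enabled.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory FTS5 index over (spoken_at, transcript) rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table and column names; the real schema may differ.
    conn.execute(
        "CREATE VIRTUAL TABLE transcripts_fts "
        "USING fts5(spoken_at UNINDEXED, transcript)"
    )
    conn.executemany(
        "INSERT INTO transcripts_fts (spoken_at, transcript) VALUES (?, ?)",
        rows,
    )
    return conn


def search(conn, phrase):
    """Return (spoken_at, transcript) rows containing the exact phrase."""
    # Wrap the phrase in double quotes so FTS5 treats it as one phrase query.
    query = '"' + phrase.replace('"', '""') + '"'
    cur = conn.execute(
        "SELECT spoken_at, transcript FROM transcripts_fts "
        "WHERE transcripts_fts MATCH ? ORDER BY rank",
        (query,),
    )
    return cur.fetchall()
```

Calling search(conn, "some phrase") returns every stored chunk that contains those words in that order, along with the timestamp stored next to it.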
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
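The buffer, queue, and finalize steps above follow a common producer/consumer pattern, which can be sketched as follows. This is a generic illustration and not ScreenPipe's code: run_pipeline and the transcribe callback are hypothetical names, and the list done merely stands in for rows committed to the database.

```python
import queue
import threading


def run_pipeline(chunks, transcribe):
    """Push chunks through one worker thread; return (chunk, text) pairs in order."""
    pending = queue.Queue()  # the backlog: captured but not yet transcribed
    done = []                # stand-in for rows committed to the database

    def worker():
        while True:
            chunk = pending.get()
            if chunk is None:  # sentinel: capture has stopped
                break
            # "Finalization": the result is stored only after transcription ends.
            done.append((chunk, transcribe(chunk)))

    thread = threading.Thread(target=worker)
    thread.start()
    for chunk in chunks:       # "Buffering": capture keeps enqueueing work
        pending.put(chunk)
    pending.put(None)
    thread.join()
    return done
```

While the worker is busy, anything still sitting in pending is the work-in-progress backlog; once the queue drains, every chunk has a finished entry.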
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
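One way to see which transcripts would lose playback is to compare the file names the database refers to against what is actually on disk. The helper below is a small sketch under assumptions: split_by_presence is a hypothetical name, and the caller is expected to collect the referenced file names from the database first, since the exact column that holds them is not confirmed here.

```python
from pathlib import Path


def split_by_presence(referenced_names, media_dir):
    """Split referenced media file names into (present, missing) lists."""
    media_dir = Path(media_dir)
    present, missing = [], []
    for name in referenced_names:
        # A dangling reference means the transcript survives but playback cannot.
        if (media_dir / name).is_file():
            present.append(name)
        else:
            missing.append(name)
    return present, missing
```

Anything in the missing list still has searchable text in the database, only the audio behind it is gone.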
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving:
If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly
older
files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
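The archiving idea above could look roughly like the sketch below, which a cron job might call once a day. It is an assumption-laden illustration: archive_old_media is a hypothetical helper, the age test uses file modification time, and whether ScreenPipe follows symlinks during playback has not been verified here, so it is worth trying on a copy of the folder first.

```python
import os
import shutil
import time
from pathlib import Path


def archive_old_media(data_dir, archive_dir, max_age_days, now=None):
    """Move .mp4 files older than max_age_days to archive_dir, leaving symlinks."""
    data_dir, archive_dir = Path(data_dir), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    current = now if now is not None else time.time()
    cutoff = current - max_age_days * 86400
    moved = []
    for path in sorted(data_dir.glob("*.mp4")):
        # Skip files that were already archived or are still recent.
        if path.is_symlink() or path.stat().st_mtime >= cutoff:
            continue
        target = archive_dir / path.name
        shutil.move(str(path), str(target))
        # The original path keeps resolving, so stored references stay valid.
        os.symlink(target.resolve(), path)
        moved.append(path.name)
    return moved
```

Running it a second time is harmless, because files that are already symlinks are skipped.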
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
(Slovak)
(Bulgarian)
(English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXButton","text":"Mute tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: 
archive.db","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: db.sqlite","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize 
sidebar","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New 
chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your 
hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. 
(Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". 
This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. 
This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes I will do that. 
Is there a way to setup languages to transcription?","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As for setting up languages for transcription,","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes, you can configure it","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", but it helps to understand how ScreenPipe handles language natively first.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Transcription Engine Handles Language","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Transcription Engine Handles Language","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, 
ScreenPipe uses a","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"multilingual Whisper model","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and sets the language configuration to","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-Detect","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Advantage:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Drawback:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. 
When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How to Force a Specific Language","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How to Force a Specific Language","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are using the Desktop App UI:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Open the ScreenPipe settings.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Navigate to the","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AI/Transcription","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"section.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Look for 
the","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper Language","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Transcription Language","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"dropdown.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Change it from \"Auto\" to your specific language (e.g., English, Bulgarian, or Slovak).","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are running ScreenPipe via CLI/Config:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You can modify your underlying configuration (usually found in","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/pipe.json","depth":28,"bounds":{"left":0.0,"top":0.0,"width":0.134375,"height":0.020555556},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or by passing a flag when starting the daemon) to force a specific language code. 
You would use the standard ISO language codes:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Slovak)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Bulgarian)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(English)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more 
options","depth":22,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXTextArea","text":"Ask Gemini","depth":20,"on_screen":true,"value":"Ask Gemini","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Ask Gemini","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file menu","depth":20,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Tools","depth":18,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open mode picker","depth":20,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Pro","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Microphone","depth":19,"on_screen":true,"role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Send message","depth":19,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":false,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini is AI and can make mistakes, including about people.","depth":17,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Your privacy and Gemini Opens in a new 
window","depth":17,"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Your privacy and Gemini","depth":18,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Opens in a new window","depth":19,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Summarize page","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Summarize page","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-5860731093613575256
|
9209072041775999965
|
click
|
accessibility
|
NULL
|
Conversation with Gemini
You said
Explain to me: how does ScreenPipe audio recording work? Where does it start? How does the audio get transferred and transcribed? Is there a work-in-progress stage? How can I tell from the folder structure what has been processed and what hasn't?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
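As a rough illustration of the FTS5 search described above, here is a minimal Python sketch. The table name transcript_fts, its columns, and the sample rows are assumptions made up for this example, not ScreenPipe's actual schema, and it relies on the local SQLite build having FTS5 enabled.

```python
import sqlite3


def build_demo_index(rows):
    """Create an in-memory FTS5 index over (timestamp, text) rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table; `timestamp` is stored but not tokenized for search.
    conn.execute(
        "CREATE VIRTUAL TABLE transcript_fts USING fts5(text, timestamp UNINDEXED)"
    )
    conn.executemany(
        "INSERT INTO transcript_fts (timestamp, text) VALUES (?, ?)", rows
    )
    return conn


def search(conn, phrase):
    """Return (timestamp, text) rows containing the phrase, best match first."""
    # Wrapping the input in double quotes makes FTS5 treat it as one phrase.
    query = '"' + phrase.replace('"', '""') + '"'
    return conn.execute(
        "SELECT timestamp, text FROM transcript_fts "
        "WHERE transcript_fts MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()


# Made-up sample transcripts for the demo.
demo = build_demo_index([
    ("2026-04-21T10:15:00", "we agreed on the quarterly budget review"),
    ("2026-05-12T07:41:10", "remember to renew the domain next week"),
])
hits = search(demo, "quarterly budget")
```

Here hits holds only the first sample row, because only that transcript contains the two words next to each other.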
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
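The three steps above can be pictured with a small Python sketch. It is a simplified, single-threaded model under stated assumptions: the class TranscriptionQueue, the transcripts table, and the stand-in transcribe function are all hypothetical and do not reflect how ScreenPipe is implemented internally.

```python
import sqlite3
from collections import deque


class TranscriptionQueue:
    """Toy model of the buffer -> queue -> database flow (hypothetical)."""

    def __init__(self, transcribe):
        self.transcribe = transcribe  # stand-in for a speech-to-text call
        self.pending = deque()        # chunks recorded but not yet transcribed
        self.db = sqlite3.connect(":memory:")
        self.db.execute("CREATE TABLE transcripts (recorded_at TEXT, text TEXT)")

    def record(self, recorded_at, audio):
        """Buffering: a new chunk arrives and waits in the queue."""
        self.pending.append((recorded_at, audio))

    def process_one(self):
        """Finalization: transcribe the oldest chunk and commit its text."""
        if not self.pending:
            return False
        recorded_at, audio = self.pending.popleft()
        text = self.transcribe(audio)
        with self.db:  # commits on success, rolls back on error
            self.db.execute(
                "INSERT INTO transcripts VALUES (?, ?)", (recorded_at, text)
            )
        return True

    def status(self):
        """Report how many chunks are still WIP and how many are done."""
        done = self.db.execute("SELECT COUNT(*) FROM transcripts").fetchone()[0]
        return {"pending": len(self.pending), "done": done}


# Demo with a fake transcriber that just decodes bytes.
wip = TranscriptionQueue(lambda audio: audio.decode())
wip.record("2026-05-12T06:49:17", b"first chunk")
wip.record("2026-05-12T06:49:47", b"second chunk")
before = wip.status()
wip.process_one()
after = wip.status()
```

In this model a chunk counts as "work in progress" while it sits in pending and as "done" once its row exists in the database.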
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database ( ): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The or folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals (like ), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
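One way to check how the database refers to these files, without assuming a schema, is to ask SQLite which columns a table has. The following is a minimal sketch: the table name audio_transcriptions comes from the answer above, while the helper name describe_table and the demo schema at the bottom are hypothetical and exist only to make the block runnable.

```python
import sqlite3


def describe_table(conn: sqlite3.Connection, table: str) -> list[tuple[str, str]]:
    """Return (column_name, declared_type) pairs for a table, or [] if it is absent."""
    # PRAGMA statements cannot take bound parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    rows = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in rows]


if __name__ == "__main__":
    # Hypothetical stand-in schema; the real db.sqlite layout may differ.
    demo = sqlite3.connect(":memory:")
    demo.execute(
        "CREATE TABLE audio_transcriptions (id INTEGER PRIMARY KEY, transcription TEXT)"
    )
    for name, col_type in describe_table(demo, "audio_transcriptions"):
        print(name, col_type)
```

To inspect a real database without risking a write, the connection can be opened read-only with sqlite3.connect("file:/path/to/db.sqlite?mode=ro", uri=True), where the path is a placeholder.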
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
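The archiving idea above could be sketched roughly as follows. This is an illustration under assumptions, not a tested ScreenPipe workflow: the function name archive_old_media, the directory arguments, and the age threshold are all hypothetical, and whether ScreenPipe follows symlinks during playback has not been verified here. The age check also serves to avoid touching files that may still be in use by the recorder.

```python
import shutil
import time
from pathlib import Path


def archive_old_media(data_dir: Path, archive_dir: Path, max_age_days: float) -> list[Path]:
    """Move old .mp4 files to archive_dir and leave symlinks in their place."""
    cutoff = time.time() - max_age_days * 86400
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for src in sorted(data_dir.glob("*.mp4")):
        # Skip files already replaced by a symlink and files that are still recent.
        if src.is_symlink() or src.stat().st_mtime > cutoff:
            continue
        dest = archive_dir / src.name
        if dest.exists():
            continue  # never overwrite an archived copy
        shutil.move(str(src), str(dest))  # falls back to copy + delete across filesystems
        src.symlink_to(dest.resolve())  # keep the original path resolvable
        moved.append(dest)
    return moved
```

A cron entry could then call a small script that runs archive_old_media(Path.home() / ".screenpipe" / "data", Path("/Volumes/nas/screenpipe"), 30), where the NAS mount point is a placeholder. If the NAS is unmounted, the symlinks dangle and playback of archived audio would fail until it is mounted again.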
You said
yes I will do that. Is there a way to setup languages to transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO 639-1 language codes:
sk (Slovak)
bg (Bulgarian)
en (English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
|
NULL
|
NULL
|
NULL
|
NULL
|
|
38736
|
1439
|
40
|
2026-05-14T06:27:40.009845+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740060009_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for working out which speaker said which segment. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
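To make the FTS5 point concrete, here is a small self-contained sketch of full-text search in SQLite. It assumes a Python build whose bundled SQLite has FTS5 enabled (most do), and the table name transcripts, its columns, and the sample rows are invented for illustration; they are not ScreenPipe's actual schema.

```python
import sqlite3


def build_demo_index(rows: list[tuple[str, str]]) -> sqlite3.Connection:
    """Create an in-memory FTS5 table and load (spoken_at, text) rows into it."""
    conn = sqlite3.connect(":memory:")
    # spoken_at is stored but not tokenized; only the text column is searchable.
    conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(spoken_at UNINDEXED, text)")
    conn.executemany("INSERT INTO transcripts (spoken_at, text) VALUES (?, ?)", rows)
    return conn


def search(conn: sqlite3.Connection, query: str) -> list[tuple[str, str]]:
    """Return matching rows, best match first, using the FTS5 MATCH operator."""
    cur = conn.execute(
        "SELECT spoken_at, text FROM transcripts WHERE transcripts MATCH ? ORDER BY rank",
        (query,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    index = build_demo_index(
        [
            ("2026-04-20T10:00:00", "We agreed on the quarterly budget"),
            ("2026-04-21T09:00:00", "Lunch plans for Friday"),
        ]
    )
    print(search(index, "budget"))
```

The query string is interpreted with FTS5 query syntax, so input containing quotes or operators would need escaping before being passed to search.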
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
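The buffering, queueing, and finalization steps above follow a common producer/consumer pattern, which can be sketched as below. This is a generic illustration and not ScreenPipe's implementation: fake_transcribe is a hypothetical stand-in for a speech-to-text call, and the "database" is just a Python list.

```python
import queue
import threading


def fake_transcribe(chunk: bytes) -> str:
    """Stand-in for a speech-to-text call; a real engine would run a model here."""
    return f"<{len(chunk)} bytes of audio>"


def run_pipeline(chunks: list[bytes]) -> list[str]:
    """Feed audio chunks through a queue to a single transcription worker."""
    pending: queue.Queue = queue.Queue()
    done: list[str] = []

    def worker() -> None:
        while True:
            chunk = pending.get()
            if chunk is None:  # sentinel: no more audio is coming
                break
            done.append(fake_transcribe(chunk))  # "finalization" of one chunk

    thread = threading.Thread(target=worker)
    thread.start()
    for chunk in chunks:  # "buffering": the capture side enqueues chunks
        pending.put(chunk)
    pending.put(None)
    thread.join()
    return done


if __name__ == "__main__":
    print(run_pipeline([b"aa", b"bbbb"]))
```

While the worker is busy, items waiting in the queue correspond to the work-in-progress backlog: captured but not yet transcribed.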
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXButton","text":"Mute 
tab","depth":5,"bounds":{"left":0.011469414,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.020113032,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.02642952,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. 
You could script a simple cron job to regularly","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.22041224,"height":0.057861134},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"bounds":{"left":0.18035239,"top":0.0,"width":0.015292553,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.22174202,"height":0.057861134},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"bounds":{"left":0.075465426,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"bounds":{"left":0.08610372,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"bounds":{"left":0.09674202,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"bounds":{"left":0.107380316,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more 
options","depth":23,"bounds":{"left":0.11801862,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"bounds":{"left":0.12566489,"top":0.0905826,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"bounds":{"left":0.14029256,"top":0.0905826,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"bounds":{"left":0.16023937,"top":0.100159615,"width":0.15026596,"height":0.03830806},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"bounds":{"left":0.068484046,"top":0.1009577,"width":0.019946808,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes I will do that. 
Is there a way to setup languages to transcription?","depth":23,"bounds":{"left":0.16023937,"top":0.10175578,"width":0.12849069,"height":0.035514764},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"bounds":{"left":0.3025266,"top":0.17039107,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"bounds":{"left":0.09208777,"top":0.17278531,"width":0.030917553,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"bounds":{"left":0.08976064,"top":0.21428572,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"bounds":{"left":0.08976064,"top":0.21628092,"width":0.04105718,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"That is a solid plan. 
Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.","depth":27,"bounds":{"left":0.0787899,"top":0.21747805,"width":0.23088431,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As for setting up languages for transcription,","depth":27,"bounds":{"left":0.0787899,"top":0.28850758,"width":0.10920878,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes, you can configure it","depth":27,"bounds":{"left":0.18799867,"top":0.28850758,"width":0.06333112,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", but it helps to understand how ScreenPipe handles language natively first.","depth":27,"bounds":{"left":0.0787899,"top":0.28850758,"width":0.20994017,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Transcription Engine Handles Language","depth":26,"bounds":{"left":0.0787899,"top":0.3499601,"width":0.234375,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Transcription Engine Handles Language","depth":27,"bounds":{"left":0.0787899,"top":0.35155627,"width":0.12549867,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses a","depth":27,"bounds":{"left":0.0787899,"top":0.37789306,"width":0.072972074,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"multilingual Whisper 
model","depth":27,"bounds":{"left":0.15176196,"top":0.37789306,"width":0.07047872,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and sets the language configuration to","depth":27,"bounds":{"left":0.0787899,"top":0.37789306,"width":0.23321144,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-Detect","depth":27,"bounds":{"left":0.08510638,"top":0.39864326,"width":0.032247342,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"bounds":{"left":0.11735372,"top":0.39864326,"width":0.0013297872,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.","depth":27,"bounds":{"left":0.0787899,"top":0.42817238,"width":0.23038563,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Advantage:","depth":29,"bounds":{"left":0.09142287,"top":0.47845173,"width":0.040724736,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.","depth":29,"bounds":{"left":0.09142287,"top":0.47845173,"width":0.22174202,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The 
Drawback:","depth":29,"bounds":{"left":0.09142287,"top":0.5494813,"width":0.038896278,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.","depth":29,"bounds":{"left":0.09142287,"top":0.5494813,"width":0.20744681,"height":0.09936153},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How to Force a Specific Language","depth":26,"bounds":{"left":0.0787899,"top":0.67318434,"width":0.234375,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How to Force a Specific Language","depth":27,"bounds":{"left":0.0787899,"top":0.67478055,"width":0.08759973,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. 
This also slightly speeds up the processing because the model skips the detection phase.","depth":27,"bounds":{"left":0.0787899,"top":0.70111734,"width":0.2278923,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are using the Desktop App UI:","depth":27,"bounds":{"left":0.0787899,"top":0.7721468,"width":0.09275266,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Open the ScreenPipe settings.","depth":29,"bounds":{"left":0.09142287,"top":0.801676,"width":0.07347074,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Navigate to the","depth":29,"bounds":{"left":0.09142287,"top":0.8312051,"width":0.038231384,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio","depth":29,"bounds":{"left":0.12965426,"top":0.8312051,"width":0.014960106,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"bounds":{"left":0.14461437,"top":0.8312051,"width":0.0078125,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AI/Transcription","depth":29,"bounds":{"left":0.15242687,"top":0.8312051,"width":0.041888297,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"section.","depth":29,"bounds":{"left":0.19431517,"top":0.8312051,"width":0.02044548,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Look for 
the","depth":29,"bounds":{"left":0.09142287,"top":0.8607342,"width":0.030585106,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper Language","depth":29,"bounds":{"left":0.12200798,"top":0.8607342,"width":0.04837101,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"bounds":{"left":0.17037898,"top":0.8607342,"width":0.0078125,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Transcription Language","depth":29,"bounds":{"left":0.17819148,"top":0.8607342,"width":0.061502658,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"dropdown.","depth":29,"bounds":{"left":0.23969415,"top":0.8607342,"width":0.027260639,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Change it from \"Auto\" to your specific language (e.g., English, Bulgarian, or Slovak).","depth":29,"bounds":{"left":0.09142287,"top":0.8902634,"width":0.20079787,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are running ScreenPipe via CLI/Config:","depth":27,"bounds":{"left":0.0787899,"top":0.92378294,"width":0.116023935,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You can modify your underlying configuration (usually found 
in","depth":27,"bounds":{"left":0.0787899,"top":0.9445331,"width":0.15159574,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/pipe.json","depth":28,"bounds":{"left":0.23238032,"top":0.94573027,"width":0.064328454,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:","depth":27,"bounds":{"left":0.0787899,"top":0.9445331,"width":0.234375,"height":0.05546689},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Slovak)","depth":29,"bounds":{"left":0.14012633,"top":1.0,"width":0.020777926,"height":-0.015562654},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Bulgarian)","depth":29,"bounds":{"left":0.14012633,"top":1.0,"width":0.027925532,"height":-0.04509175},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(English)","depth":29,"bounds":{"left":0.14012633,"top":1.0,"width":0.02244016,"height":-0.07462096},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle 
button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":22,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXTextArea","text":"Ask Gemini","depth":20,"bounds":{"left":0.08211436,"top":0.83439744,"width":0.22573139,"height":0.01915403},"on_screen":true,"value":"Ask Gemini","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Ask Gemini","depth":21,"bounds":{"left":0.08211436,"top":0.8347965,"width":0.030086435,"height":0.018355945},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file 
menu","depth":20,"bounds":{"left":0.078125,"top":0.87031126,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Tools","depth":18,"bounds":{"left":0.094082445,"top":0.87031126,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false}]...
|
-7705754712096935829
|
9209072041775999965
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Mute tab
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/
directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the
db.sqlite
database along with
precise timestamps and file path references
.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific
.mp4
file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw
.mp4
files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that
.mp4
file is your fallback source of truth to verify what was actually said.
How the Database Views Them
How the Database Views Them
If you were to open up your
db.sqlite
and query the tables (e.g.,
audio_transcriptions
), you would see that the database treats the
~/.screenpipe/data/
folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like
soundcore AeroClip (input)_2026-05-12_07-40-48.mp4
.
What happens if you delete them?
If you manually
rm
those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection:
ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving:
If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly
older
files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
Edit
You said yes I will do that. Is there a way to setup languages to transcription?
You said
yes I will do that. Is there a way to setup languages to transcription?
Listen
Show thinking
Gemini said
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription,
yes, you can configure it
, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
How the Transcription Engine Handles Language
By default, ScreenPipe uses a
multilingual Whisper model
and sets the language configuration to
Auto-Detect
.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage:
If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback:
Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the
Audio
or
AI/Transcription
section.
Look for the
Whisper Language
or
Transcription Language
dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in
~/.screenpipe/pipe.json
or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
(Slovak)
(Bulgarian)
(English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
Good response
Bad response
Redo
Share and export
Copy
Show more options
Ask Gemini
Ask Gemini
Open upload file menu
Tools...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
38791
|
1437
|
57
|
2026-05-14T06:28:54.678541+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740134678_m1.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Mute tab
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/
directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the
db.sqlite
database along with
precise timestamps and file path references
.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific
.mp4
file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw
.mp4
files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that
.mp4
file is your fallback source of truth to verify what was actually said.
How the Database Views Them
How the Database Views Them
If you were to open up your
db.sqlite
and query the tables (e.g.,
audio_transcriptions
), you would see that the database treats the
~/.screenpipe/data/
folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like
soundcore AeroClip (input)_2026-05-12_07-40-48.mp4
.
What happens if you delete them?
If you manually
rm
those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection:
ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving:
If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly
older
files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
Edit
You said yes I will do that. Is there a way to setup languages to transcription?
You said
yes I will do that. Is there a way to setup languages to transcription?
Listen
Show thinking
Gemini said
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription,
yes, you can configure it
, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
How the Transcription Engine Handles Language
By default, ScreenPipe uses a
multilingual Whisper model
and sets the language configuration to
Auto-Detect
.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage:
If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback:
Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the
Audio
or
AI/Transcription
section.
Look for the
Whisper Language
or
Transcription Language
dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in
~/.screenpipe/pipe.json
or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
(Slovak)
(Bulgarian)
(English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
Good response
Bad response
Redo
Share and export
Copy
Show more options
Ask Gemini
Ask Gemini
Open upload file menu...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXButton","text":"Mute tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: 
archive.db","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: db.sqlite","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize 
sidebar","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New 
chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your 
hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. 
(Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". 
This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. 
This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes I will do that. 
Is there a way to setup languages to transcription?","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As for setting up languages for transcription,","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes, you can configure it","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", but it helps to understand how ScreenPipe handles language natively first.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Transcription Engine Handles Language","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Transcription Engine Handles Language","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, 
ScreenPipe uses a","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"multilingual Whisper model","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and sets the language configuration to","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-Detect","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Advantage:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Drawback:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. 
When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How to Force a Specific Language","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How to Force a Specific Language","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are using the Desktop App UI:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Open the ScreenPipe settings.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Navigate to the","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AI/Transcription","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"section.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Look for 
the","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper Language","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Transcription Language","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"dropdown.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Change it from \"Auto\" to your specific language (e.g., English, Bulgarian, or Slovak).","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are running ScreenPipe via CLI/Config:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You can modify your underlying configuration (usually found in","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/pipe.json","depth":28,"bounds":{"left":0.0,"top":0.0,"width":0.134375,"height":0.020555556},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or by passing a flag when starting the daemon) to force a specific language code. 
You would use the standard ISO language codes:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Slovak)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Bulgarian)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(English)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more 
options","depth":22,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXTextArea","text":"Ask Gemini","depth":20,"on_screen":true,"value":"Ask Gemini","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Ask Gemini","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file menu","depth":20,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false}]...
|
-7615559294481270590
|
9209072041775999965
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Mute tab
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/
directory and how they interact with the rest of the system.
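The filenames quoted in the question appear to follow a "device (direction)_date_time.mp4" pattern. The sketch below splits such a name into its parts. The pattern is inferred only from the four example names above, not from ScreenPipe's documentation, and `AudioChunkName` and `parse_chunk_name` are hypothetical names introduced for illustration.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class AudioChunkName(NamedTuple):
    """Parts of an audio file name (hypothetical helper type)."""
    device: str
    direction: str        # "input" or "output"
    started_at: datetime  # timestamp embedded in the file name


# Assumed layout, inferred from the example names in the conversation:
#   <device name> (<input|output>)_<YYYY-MM-DD>_<HH-MM-SS>.mp4
_PATTERN = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)_"
    r"(?P<date>\d{4}-\d{2}-\d{2})_(?P<time>\d{2}-\d{2}-\d{2})\.mp4$"
)


def parse_chunk_name(filename: str) -> Optional[AudioChunkName]:
    """Return the parsed parts, or None if the name does not fit the pattern."""
    match = _PATTERN.match(filename)
    if match is None:
        return None
    started_at = datetime.strptime(
        f"{match['date']}_{match['time']}", "%Y-%m-%d_%H-%M-%S"
    )
    return AudioChunkName(match["device"], match["direction"], started_at)
```

Grouping files by `device` or `direction` this way would make it easy to see, for instance, how much of the folder is microphone input versus system output.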
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the
db.sqlite
database along with
precise timestamps and file path references
.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific
.mp4
file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw
.mp4
files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that
.mp4
file is your fallback source of truth to verify what was actually said.
How the Database Views Them
How the Database Views Them
If you were to open up your
db.sqlite
and query the tables (e.g.,
audio_transcriptions
), you would see that the database treats the
~/.screenpipe/data/
folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like
soundcore AeroClip (input)_2026-05-12_07-40-48.mp4
.
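The answer above names a table but not its columns, so the safest first step is to inspect the schema rather than assume it. The following is a minimal sketch that opens a SQLite file read-only and lists every table with its column names. `describe_tables` is a hypothetical helper, and nothing here assumes which columns ScreenPipe actually uses.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def describe_tables(db_path):
    """Map each table name in the database to its list of column names."""
    # mode=ro opens the file read-only, so the inspection cannot modify it.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    schema = {}
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        tables = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        for (table,) in tables:
            quoted = table.replace('"', '""')  # escape quotes in identifiers
            columns = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
            schema[table] = [column[1] for column in columns]  # index 1 = name
    return schema
```

Calling `describe_tables` on a copy of the database should show which column, if any, holds the media file path, which is worth confirming before writing queries against it.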
What happens if you delete them?
If you manually
rm
those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection:
ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving:
If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly
older
files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
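The archiving idea above can be sketched as a small script that a cron job could run. This is an illustration under stated assumptions, not a ScreenPipe feature: `archive_old_media` is a hypothetical helper, the 30-day default merely echoes the retention example above, and whether ScreenPipe follows symlinks during playback is assumed rather than verified.

```python
import shutil
import time
from pathlib import Path


def archive_old_media(data_dir, archive_dir, max_age_days=30, now=None):
    """Move .mp4 files older than max_age_days and leave symlinks behind.

    Returns the list of original paths that were replaced by symlinks.
    """
    data_dir = Path(data_dir)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    cutoff = (time.time() if now is None else now) - max_age_days * 86400

    moved = []
    for path in sorted(data_dir.glob("*.mp4")):
        # Skip files that were already archived (symlinks) and non-files.
        if path.is_symlink() or not path.is_file():
            continue
        # Skip files modified more recently than the cutoff.
        if path.stat().st_mtime >= cutoff:
            continue
        target = archive_dir / path.name
        if target.exists():
            continue  # never overwrite something already in the archive
        shutil.move(str(path), str(target))  # copies across filesystems if needed
        path.symlink_to(target.resolve())    # keep the original path resolvable
        moved.append(path)
    return moved
```

Trying this on a copy of the data folder first would confirm that playback still works through the symlinks before pointing it at the real directory and the NAS mount.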
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
Edit
You said yes I will do that. Is there a way to setup languages to transcription?
You said
yes I will do that. Is there a way to setup languages to transcription?
Listen
Show thinking
Gemini said
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription,
yes, you can configure it
, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
How the Transcription Engine Handles Language
By default, ScreenPipe uses a
multilingual Whisper model
and sets the language configuration to
Auto-Detect
.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage:
If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback:
Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the
Audio
or
AI/Transcription
section.
Look for the
Whisper Language
or
Transcription Language
dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in
~/.screenpipe/pipe.json
or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
(Slovak)
(Bulgarian)
(English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
Good response
Bad response
Redo
Share and export
Copy
Show more options
Ask Gemini
Ask Gemini
Open upload file menu...
|
38788
|
NULL
|
NULL
|
NULL
|
|
38961
|
1441
|
14
|
2026-05-14T06:31:31.598928+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740291598_m1.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Mute tab
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/
directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the
db.sqlite
database along with
precise timestamps and file path references
.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific
.mp4
file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw
.mp4
files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that
.mp4
file is your fallback source of truth to verify what was actually said.
How the Database Views Them
How the Database Views Them
If you were to open up your
db.sqlite
and query the tables (e.g.,
audio_transcriptions
), you would see that the database treats the
~/.screenpipe/data/
folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like
soundcore AeroClip (input)_2026-05-12_07-40-48.mp4
.
What happens if you delete them?
If you manually
rm
those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection:
ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
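The archiving idea above can be sketched in a few lines of Python. This is one possible approach under stated assumptions, not a ScreenPipe feature: the function name, the 30-day default, and the archive path are all hypothetical, and the script assumes a POSIX system where creating symlinks needs no special privileges.

```python
import shutil
import time
from pathlib import Path


def archive_old_media(data_dir, archive_dir, max_age_days=30, now=None):
    """Move old .mp4 files to archive_dir and leave symlinks behind.

    Returns the names of the files that were moved.
    """
    data_dir, archive_dir = Path(data_dir), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    cutoff = (now if now is not None else time.time()) - max_age_days * 86400
    moved = []
    for src in sorted(data_dir.glob("*.mp4")):
        # Skip files already replaced by a symlink and files that are too new.
        if src.is_symlink() or src.stat().st_mtime > cutoff:
            continue
        dest = archive_dir / src.name
        if dest.exists():
            continue  # never overwrite an existing archive copy
        shutil.move(str(src), str(dest))
        # The original path still resolves, so stored references keep working.
        src.symlink_to(dest.resolve())
        moved.append(src.name)
    return moved
```

A cron entry could call archive_old_media with ~/.screenpipe/data and a mounted NAS folder. If the NAS is not mounted, the symlinks dangle and playback fails until the mount is back.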
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
(Slovak)
(Bulgarian)
(English)
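For reference, the ISO 639-1 codes for the three languages listed above are sk (Slovak), bg (Bulgarian), and en (English). The sketch below shows one way to validate a language choice before handing it to a transcription setting. The helper name and the convention that "auto" maps to None are illustrative assumptions, not part of ScreenPipe's configuration format.

```python
# ISO 639-1 two-letter codes for the languages discussed above.
ISO_639_1 = {"slovak": "sk", "bulgarian": "bg", "english": "en"}


def resolve_language(choice):
    """Map a language name or code to an ISO 639-1 code.

    Returns None for "auto", meaning the model should detect the
    language itself. This helper is hypothetical, not a ScreenPipe API.
    """
    key = choice.strip().lower()
    if key == "auto":
        return None
    if key in ISO_639_1.values():
        return key
    if key in ISO_639_1:
        return ISO_639_1[key]
    raise ValueError(f"unsupported language: {choice!r}")
```

Rejecting unknown values early avoids silently falling back to auto-detect because of a typo.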
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXButton","text":"Mute tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: 
archive.db","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: db.sqlite","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize 
sidebar","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New 
chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your 
hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. 
(Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". 
This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. 
This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes I will do that. 
Is there a way to setup languages to transcription?","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As for setting up languages for transcription,","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes, you can configure it","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", but it helps to understand how ScreenPipe handles language natively first.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Transcription Engine Handles Language","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Transcription Engine Handles Language","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, 
ScreenPipe uses a","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"multilingual Whisper model","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and sets the language configuration to","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-Detect","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Advantage:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Drawback:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. 
When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How to Force a Specific Language","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How to Force a Specific Language","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are using the Desktop App UI:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Open the ScreenPipe settings.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Navigate to the","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AI/Transcription","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"section.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Look for 
the","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper Language","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Transcription Language","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"dropdown.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Change it from \"Auto\" to your specific language (e.g., English, Bulgarian, or Slovak).","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are running ScreenPipe via CLI/Config:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You can modify your underlying configuration (usually found in","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/pipe.json","depth":28,"bounds":{"left":0.0,"top":0.0,"width":0.134375,"height":0.020555556},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or by passing a flag when starting the daemon) to force a specific language code. 
You would use the standard ISO language codes:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Slovak)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Bulgarian)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(English)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more 
options","depth":22,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXTextArea","text":"Ask Gemini","depth":20,"on_screen":true,"value":"Ask Gemini","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Ask Gemini","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file menu","depth":20,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Tools","depth":18,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false}]...
|
-7705754712096935829
|
9209072041775999965
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Mute tab
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
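To make the FTS5 point concrete, here is a minimal sketch of a phrase search over transcript text using Python's built-in sqlite3 module. The table name transcript_fts and its columns are hypothetical and chosen for illustration only; they are not ScreenPipe's actual schema, and the sketch assumes the local SQLite build includes FTS5.

```python
import sqlite3


def build_demo_index(rows):
    """Create an in-memory FTS5 index over (timestamp, text) rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table and column names, for illustration only.
    conn.execute(
        "CREATE VIRTUAL TABLE transcript_fts USING fts5(timestamp UNINDEXED, text)"
    )
    conn.executemany(
        "INSERT INTO transcript_fts (timestamp, text) VALUES (?, ?)", rows
    )
    conn.commit()
    return conn


def search(conn, phrase):
    """Return (timestamp, text) rows matching the phrase, best match first."""
    # Double quotes make FTS5 treat the whole input as one phrase query.
    quoted = '"' + phrase.replace('"', '""') + '"'
    cur = conn.execute(
        "SELECT timestamp, text FROM transcript_fts "
        "WHERE transcript_fts MATCH ? ORDER BY rank",
        (quoted,),
    )
    return cur.fetchall()
```

The same MATCH query shape should work against a real FTS5 table once its actual name is known, for example by listing the tables in sqlite_master.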
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
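The buffering, queue, and finalization stages above can be sketched as a simple producer/consumer loop. This is an illustrative model only, not ScreenPipe's implementation: run_pipeline, the transcribe callback, and the done list (standing in for database rows) are all hypothetical names.

```python
import queue
import threading


def run_pipeline(chunks, transcribe):
    """Feed (chunk_id, audio) pairs through one worker and collect results."""
    pending = queue.Queue()  # WIP stage: chunks waiting for the model
    done = []                # stand-in for rows committed to the database

    def worker():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more chunks will arrive
                break
            chunk_id, audio = item
            # Finalization: the text is stored only after transcription ends.
            done.append((chunk_id, transcribe(audio)))

    thread = threading.Thread(target=worker)
    thread.start()
    for item in chunks:      # capture side: enqueue chunks as they are cut
        pending.put(item)
    pending.put(None)
    thread.join()
    return done
```

With a single worker the results come out in capture order, and anything still sitting in pending corresponds to the backlog described above.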
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/
directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the
db.sqlite
database along with
precise timestamps and file path references
.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific
.mp4
file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw
.mp4
files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that
.mp4
file is your fallback source of truth to verify what was actually said.
How the Database Views Them
How the Database Views Them
If you were to open up your
db.sqlite
and query the tables (e.g.,
audio_transcriptions
), you would see that the database treats the
~/.screenpipe/data/
folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like
soundcore AeroClip (input)_2026-05-12_07-40-48.mp4
.
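The filenames quoted in this conversation appear to follow a "device (input|output)_date_time.mp4" pattern. As a minimal sketch, assuming that pattern holds for every file, the parts can be pulled apart like this. The function name is hypothetical, and the source does not say whether the timestamp is local time or UTC, so it is parsed as a naive datetime.

```python
import re
from datetime import datetime

# Assumed pattern, inferred only from the four example filenames above.
PATTERN = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)_"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)


def parse_audio_filename(name):
    """Return (device, direction, started) or None if the name does not match."""
    match = PATTERN.match(name)
    if match is None:
        return None
    started = datetime.strptime(match.group("stamp"), "%Y-%m-%d_%H-%M-%S")
    return match.group("device"), match.group("direction"), started
```

Grouping files by device or by day this way can help when deciding what to archive, without opening the database at all.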
What happens if you delete them?
If you manually
rm
those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
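One way to see which transcripts have lost their audio is to compare the file references stored in the database with what is actually on disk. The sketch below assumes the references live in a table and column named audio_chunks.file_path; that name is an assumption, not something confirmed in this conversation, so check the real schema first. The function name is hypothetical as well.

```python
import os
import sqlite3


def missing_media(conn, data_dir, table="audio_chunks", column="file_path"):
    """Return stored file references whose media file no longer exists."""
    # table and column are assumptions and are interpolated into the SQL,
    # so pass trusted identifiers only.
    rows = conn.execute(f'SELECT DISTINCT "{column}" FROM "{table}"').fetchall()
    missing = []
    for (ref,) in rows:
        # Absolute paths are used as stored; bare names resolve against data_dir.
        path = ref if os.path.isabs(ref) else os.path.join(data_dir, ref)
        if not os.path.exists(path):
            missing.append(ref)
    return sorted(missing)
```

Because os.path.exists follows symlinks, a file that was replaced by a working symlink counts as present, while a symlink pointing at an unmounted archive counts as missing.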
Managing the Storage Footprint
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection:
ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving:
If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly
older
files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
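The archiving idea above can be sketched as a small script suitable for a cron job. This is a minimal sketch under stated assumptions: the function name, the 30-day default, and the use of file modification time to decide what is "older" are all illustrative choices, and whether ScreenPipe follows symlinks during playback is taken from the explanation above rather than verified here.

```python
import os
import shutil
import time


def archive_old_media(data_dir, archive_dir, max_age_days=30, now=None):
    """Move old .mp4 files to archive_dir and leave symlinks in their place."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    os.makedirs(archive_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(data_dir)):
        src = os.path.join(data_dir, name)
        # Skip non-media files, directories, and files already replaced by links.
        if not name.endswith(".mp4") or os.path.islink(src) or not os.path.isfile(src):
            continue
        if os.path.getmtime(src) >= cutoff:
            continue  # recent file, possibly still being written
        dst = os.path.join(archive_dir, name)
        if os.path.exists(dst):
            continue  # never overwrite an existing archive copy
        shutil.move(src, dst)  # copies across filesystems, then removes src
        os.symlink(os.path.abspath(dst), src)
        moved.append(name)
    return moved
```

A call such as archive_old_media(os.path.expanduser("~/.screenpipe/data"), "/Volumes/nas/screenpipe") would be the intended usage, where the NAS mount point is a placeholder. If the NAS is not mounted, the symlinks dangle and playback of those files fails until it is mounted again.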
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
Edit
You said yes I will do that. Is there a way to setup languages to transcription?
You said
yes I will do that. Is there a way to setup languages to transcription?
Listen
Show thinking
Gemini said
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription,
yes, you can configure it
, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
How the Transcription Engine Handles Language
By default, ScreenPipe uses a
multilingual Whisper model
and sets the language configuration to
Auto-Detect
.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage:
If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback:
Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the
Audio
or
AI/Transcription
section.
Look for the
Whisper Language
or
Transcription Language
dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in
~/.screenpipe/pipe.json
or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
(Slovak)
(Bulgarian)
(English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
Good response
Bad response
Redo
Share and export
Copy
Show more options
Ask Gemini
Ask Gemini
Open upload file menu
Tools...
|
38959
|
NULL
|
NULL
|
NULL
|
|
38967
|
1442
|
14
|
2026-05-14T06:31:37.799634+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740297799_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Mute tab
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
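One way to check "done versus not done" from the outside is to compare the media files on disk with the file paths the database knows about. The sketch below does that with the standard library only. The table name `audio_chunks` and the column `file_path` are assumptions about the schema, so inspect your own database and adjust the query before relying on it.

```python
import sqlite3
from pathlib import Path


def referenced_files(db_path: Path) -> set[str]:
    """File names the database points at.

    ASSUMPTION: a table `audio_chunks` with a `file_path` column exists.
    """
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("SELECT file_path FROM audio_chunks").fetchall()
    finally:
        conn.close()
    return {Path(path).name for (path,) in rows}


def media_without_db_row(data_dir: Path, db_path: Path) -> list[str]:
    """Names of .mp4 files on disk that no database row refers to."""
    on_disk = {p.name for p in data_dir.glob("*.mp4")}
    return sorted(on_disk - referenced_files(db_path))
```

A file that shows up in this list has been written to disk but not yet registered in the database. A file that is referenced may still be waiting for its transcript, so treat the result as a hint, not a guarantee.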
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
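The file names quoted in the question already carry the device name, the direction, and a timestamp, so the "jump to the exact second" idea can be sketched from the name alone. The regular expression below is inferred from those four example names. Treating the embedded timestamp as the recording start, in the same timezone as the transcript timestamp, is an assumption; the real player may use offsets stored in the database instead.

```python
import re
from datetime import datetime

# Pattern inferred from names such as
# "System Audio (output)_2026-05-11_06-17-14.mp4".
FILENAME_RE = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)_"
    r"(?P<start>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)


def parse_media_name(name: str) -> tuple[str, str, datetime]:
    """Split a media file name into (device, direction, start time)."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised media file name: {name!r}")
    start = datetime.strptime(match["start"], "%Y-%m-%d_%H-%M-%S")
    return match["device"], match["direction"], start


def playback_offset_seconds(name: str, spoken_at: datetime) -> float:
    """Seconds into the file at which `spoken_at` falls (assumed semantics)."""
    _, _, start = parse_media_name(name)
    return (spoken_at - start).total_seconds()
```

For a phrase spoken at 07:42:00 in the file that starts at 07:40:48, the function returns 72.0, which is the position a player would seek to.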
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will most likely fail silently or log a "file not found" error in the background.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving:
If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly
older
files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
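The archiving idea can be sketched as a small script that a scheduled job could run. It moves media files older than a cutoff into an archive directory and leaves a symlink behind under the original name, so any path stored in the database still resolves. The function name, the 30-day default, and the reliance on file modification time are choices made for this example, not ScreenPipe behavior.

```python
import os
import shutil
import time
from pathlib import Path


def archive_old_media(
    data_dir: Path, archive_dir: Path, max_age_days: float = 30.0
) -> list[Path]:
    """Move old .mp4 files to `archive_dir`, leaving symlinks behind."""
    cutoff = time.time() - max_age_days * 86_400
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved: list[Path] = []
    for src in sorted(data_dir.glob("*.mp4")):
        # Skip files that were already archived (now symlinks) or are recent.
        if src.is_symlink() or src.stat().st_mtime >= cutoff:
            continue
        dest = archive_dir / src.name
        if dest.exists():
            continue  # never overwrite something already in the archive
        shutil.move(str(src), str(dest))
        os.symlink(dest.resolve(), src)  # old path now points at the archive
        moved.append(dest)
    return moved
```

Pointing `archive_dir` at a mounted NAS share keeps the local folder small. If the share is unmounted, the symlinks dangle and playback of those files fails until it is mounted again.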
You said
yes I will do that. Is there a way to setup languages to transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
(Slovak)
(Bulgarian)
(English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
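For reference, the ISO 639-1 codes for the three languages mentioned are `sk` (Slovak), `bg` (Bulgarian), and `en` (English). The helper below is a hypothetical sketch of how a "language" setting could be normalized before it is handed to a transcription engine; the function name and the convention that `None` means auto-detect are assumptions, not part of ScreenPipe's configuration format.

```python
from typing import Optional

# ISO 639-1 codes for the languages discussed above.
ISO_639_1 = {"slovak": "sk", "bulgarian": "bg", "english": "en"}


def resolve_language(setting: str) -> Optional[str]:
    """Map a user-facing setting to a language code, or None for auto-detect."""
    key = setting.strip().lower()
    if key in ("", "auto", "auto-detect"):
        return None  # let the model guess the language per chunk
    if key in ISO_639_1:
        return ISO_639_1[key]
    if key in ISO_639_1.values():
        return key  # already a known code
    raise ValueError(f"unsupported language setting: {setting!r}")
```

Rejecting unknown values up front avoids silently falling back to auto-detect when a code is mistyped.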
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXButton","text":"Mute 
tab","depth":5,"bounds":{"left":0.011469414,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.020113032,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.02642952,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. 
You could script a simple cron job to regularly","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.22041224,"height":0.057861134},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"bounds":{"left":0.18035239,"top":0.0,"width":0.015292553,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.22174202,"height":0.057861134},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"bounds":{"left":0.075465426,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"bounds":{"left":0.08610372,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"bounds":{"left":0.09674202,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"bounds":{"left":0.107380316,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more 
options","depth":23,"bounds":{"left":0.11801862,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"bounds":{"left":0.12566489,"top":0.0905826,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"bounds":{"left":0.14029256,"top":0.0905826,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"bounds":{"left":0.16023937,"top":0.100159615,"width":0.15026596,"height":0.03830806},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"bounds":{"left":0.068484046,"top":0.1009577,"width":0.019946808,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes I will do that. 
Is there a way to setup languages to transcription?","depth":23,"bounds":{"left":0.16023937,"top":0.10175578,"width":0.12849069,"height":0.035514764},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"bounds":{"left":0.3025266,"top":0.17039107,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"bounds":{"left":0.09208777,"top":0.17278531,"width":0.030917553,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"bounds":{"left":0.08976064,"top":0.21428572,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"bounds":{"left":0.08976064,"top":0.21628092,"width":0.04105718,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"That is a solid plan. 
Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.","depth":27,"bounds":{"left":0.0787899,"top":0.21747805,"width":0.23088431,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As for setting up languages for transcription,","depth":27,"bounds":{"left":0.0787899,"top":0.28850758,"width":0.10920878,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes, you can configure it","depth":27,"bounds":{"left":0.18799867,"top":0.28850758,"width":0.06333112,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", but it helps to understand how ScreenPipe handles language natively first.","depth":27,"bounds":{"left":0.0787899,"top":0.28850758,"width":0.20994017,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Transcription Engine Handles Language","depth":26,"bounds":{"left":0.0787899,"top":0.3499601,"width":0.234375,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Transcription Engine Handles Language","depth":27,"bounds":{"left":0.0787899,"top":0.35155627,"width":0.12549867,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses a","depth":27,"bounds":{"left":0.0787899,"top":0.37789306,"width":0.072972074,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"multilingual Whisper 
model","depth":27,"bounds":{"left":0.15176196,"top":0.37789306,"width":0.07047872,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and sets the language configuration to","depth":27,"bounds":{"left":0.0787899,"top":0.37789306,"width":0.23321144,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-Detect","depth":27,"bounds":{"left":0.08510638,"top":0.39864326,"width":0.032247342,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"bounds":{"left":0.11735372,"top":0.39864326,"width":0.0013297872,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.","depth":27,"bounds":{"left":0.0787899,"top":0.42817238,"width":0.23038563,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Advantage:","depth":29,"bounds":{"left":0.09142287,"top":0.47845173,"width":0.040724736,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.","depth":29,"bounds":{"left":0.09142287,"top":0.47845173,"width":0.22174202,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The 
Drawback:","depth":29,"bounds":{"left":0.09142287,"top":0.5494813,"width":0.038896278,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.","depth":29,"bounds":{"left":0.09142287,"top":0.5494813,"width":0.20744681,"height":0.09936153},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How to Force a Specific Language","depth":26,"bounds":{"left":0.0787899,"top":0.67318434,"width":0.234375,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How to Force a Specific Language","depth":27,"bounds":{"left":0.0787899,"top":0.67478055,"width":0.08759973,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. 
This also slightly speeds up the processing because the model skips the detection phase.","depth":27,"bounds":{"left":0.0787899,"top":0.70111734,"width":0.2278923,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are using the Desktop App UI:","depth":27,"bounds":{"left":0.0787899,"top":0.7721468,"width":0.09275266,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Open the ScreenPipe settings.","depth":29,"bounds":{"left":0.09142287,"top":0.801676,"width":0.07347074,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Navigate to the","depth":29,"bounds":{"left":0.09142287,"top":0.8312051,"width":0.038231384,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio","depth":29,"bounds":{"left":0.12965426,"top":0.8312051,"width":0.014960106,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"bounds":{"left":0.14461437,"top":0.8312051,"width":0.0078125,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AI/Transcription","depth":29,"bounds":{"left":0.15242687,"top":0.8312051,"width":0.041888297,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"section.","depth":29,"bounds":{"left":0.19431517,"top":0.8312051,"width":0.02044548,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Look for 
the","depth":29,"bounds":{"left":0.09142287,"top":0.8607342,"width":0.030585106,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper Language","depth":29,"bounds":{"left":0.12200798,"top":0.8607342,"width":0.04837101,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"bounds":{"left":0.17037898,"top":0.8607342,"width":0.0078125,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Transcription Language","depth":29,"bounds":{"left":0.17819148,"top":0.8607342,"width":0.061502658,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"dropdown.","depth":29,"bounds":{"left":0.23969415,"top":0.8607342,"width":0.027260639,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Change it from \"Auto\" to your specific language (e.g., English, Bulgarian, or Slovak).","depth":29,"bounds":{"left":0.09142287,"top":0.8902634,"width":0.20079787,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are running ScreenPipe via CLI/Config:","depth":27,"bounds":{"left":0.0787899,"top":0.92378294,"width":0.116023935,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You can modify your underlying configuration (usually found 
in","depth":27,"bounds":{"left":0.0787899,"top":0.9445331,"width":0.15159574,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/pipe.json","depth":28,"bounds":{"left":0.23238032,"top":0.94573027,"width":0.064328454,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:","depth":27,"bounds":{"left":0.0787899,"top":0.9445331,"width":0.234375,"height":0.05546689},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Slovak)","depth":29,"bounds":{"left":0.14012633,"top":1.0,"width":0.020777926,"height":-0.015562654},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Bulgarian)","depth":29,"bounds":{"left":0.14012633,"top":1.0,"width":0.027925532,"height":-0.04509175},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(English)","depth":29,"bounds":{"left":0.14012633,"top":1.0,"width":0.02244016,"height":-0.07462096},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle 
button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":22,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXTextArea","text":"Ask Gemini","depth":20,"bounds":{"left":0.08211436,"top":0.83439744,"width":0.22573139,"height":0.01915403},"on_screen":true,"value":"Ask Gemini","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Ask Gemini","depth":21,"bounds":{"left":0.08211436,"top":0.8347965,"width":0.030086435,"height":0.018355945},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file 
menu","depth":20,"bounds":{"left":0.078125,"top":0.87031126,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Tools","depth":18,"bounds":{"left":0.094082445,"top":0.87031126,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open mode picker","depth":20,"bounds":{"left":0.27044547,"top":0.867917,"width":0.026097074,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Pro","depth":23,"bounds":{"left":0.2757646,"top":0.87669593,"width":0.007480053,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Microphone","depth":19,"bounds":{"left":0.29853722,"top":0.867917,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Send message","depth":19,"bounds":{"left":0.30485374,"top":0.8671189,"width":0.013962766,"height":0.033519555},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":false,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini is AI and can make mistakes, including about people.","depth":17,"bounds":{"left":0.11702128,"top":0.92178774,"width":0.11170213,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Your privacy and Gemini Opens in a new 
window","depth":17,"bounds":{"left":0.2287234,"top":0.92178774,"width":0.044215426,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Your privacy and Gemini","depth":18,"bounds":{"left":0.2287234,"top":0.92178774,"width":0.044215426,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Opens in a new window","depth":19,"bounds":{"left":0.068484046,"top":0.92098963,"width":0.043218084,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
1146518781610073049
|
9209072041775999965
|
visual_change
|
accessibility
|
NULL
|
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How does it get transferred and transcribed? Is there a work-in-progress stage? How can I tell from the folder structure what has been done and what hasn't?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
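A minimal sketch of the kind of phrase search that FTS5 makes possible, using Python's built-in sqlite3 module. The table name `transcripts`, its columns, and the sample rows are hypothetical and chosen only for illustration; ScreenPipe's actual schema is not shown in this conversation. It also assumes the local SQLite build has FTS5 compiled in.

```python
import sqlite3

# In-memory database so the sketch does not touch any real ScreenPipe file.
conn = sqlite3.connect(":memory:")

# Hypothetical table: `spoken_at` is stored but not indexed, `text` is searchable.
conn.execute(
    "CREATE VIRTUAL TABLE transcripts USING fts5(spoken_at UNINDEXED, text)"
)
conn.executemany(
    "INSERT INTO transcripts (spoken_at, text) VALUES (?, ?)",
    [
        ("2026-05-12T10:05:00", "We should review the retention settings before Friday"),
        ("2026-05-12T10:06:30", "The NAS copy runs at night"),
    ],
)


def search(conn, phrase):
    """Return (spoken_at, text) rows whose text contains the exact phrase."""
    # Wrapping the input in double quotes makes FTS5 treat it as one phrase;
    # embedded double quotes are escaped by doubling them.
    query = '"' + phrase.replace('"', '""') + '"'
    rows = conn.execute(
        "SELECT spoken_at, text FROM transcripts "
        "WHERE transcripts MATCH ? ORDER BY rank",
        (query,),
    )
    return rows.fetchall()
```

Calling `search(conn, "retention settings")` matches only the first sample row, because a quoted FTS5 phrase requires the words to appear consecutively and in that order.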
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/
directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the
db.sqlite
database along with
precise timestamps and file path references
.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific
.mp4
file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw
.mp4
files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that
.mp4
file is your fallback source of truth to verify what was actually said.
How the Database Views Them
How the Database Views Them
If you were to open up your
db.sqlite
and query the tables (e.g.,
audio_transcriptions
), you would see that the database treats the
~/.screenpipe/data/
folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like
soundcore AeroClip (input)_2026-05-12_07-40-48.mp4
.
What happens if you delete them?
If you manually
rm
those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection:
ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving:
If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly
older
files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
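The archiving idea above could be sketched roughly as follows. This is an illustration under stated assumptions, not a ScreenPipe feature: the function name `archive_old_media`, the 30-day threshold, and both directory arguments are hypothetical, and whether ScreenPipe's playback follows symlinks has not been verified here.

```python
import os
import shutil
import time
from pathlib import Path


def archive_old_media(data_dir, archive_dir, max_age_days=30, now=None):
    """Move .mp4 files older than max_age_days into archive_dir and leave
    a symlink at the original path. Returns the names of the moved files."""
    data_dir = Path(data_dir)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    cutoff = (now if now is not None else time.time()) - max_age_days * 86400

    moved = []
    for src in sorted(data_dir.glob("*.mp4")):
        # Skip entries that were archived on an earlier run (now symlinks).
        if src.is_symlink() or not src.is_file():
            continue
        if src.stat().st_mtime >= cutoff:
            continue  # still recent, keep it on the local disk
        dest = archive_dir / src.name
        if dest.exists():
            continue  # never overwrite an existing archive copy
        shutil.move(str(src), str(dest))  # falls back to copy + delete across filesystems
        os.symlink(dest.resolve(), src)   # the original path now points at the archive
        moved.append(src.name)
    return moved
```

A cron entry could call this once a day with the local data folder and a mounted NAS path. Running it a second time is harmless, since already-archived files are symlinks and are skipped. It would be prudent to test playback on a single file before archiving in bulk.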
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
Edit
You said yes I will do that. Is there a way to setup languages to transcription?
You said
yes I will do that. Is there a way to setup languages to transcription?
Listen
Show thinking
Gemini said
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription,
yes, you can configure it
, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
How the Transcription Engine Handles Language
By default, ScreenPipe uses a
multilingual Whisper model
and sets the language configuration to
Auto-Detect
.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage:
If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback:
Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the
Audio
or
AI/Transcription
section.
Look for the
Whisper Language
or
Transcription Language
dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in
~/.screenpipe/pipe.json
or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
(Slovak)
(Bulgarian)
(English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
Good response
Bad response
Redo
Share and export
Copy
Show more options
Ask Gemini
Ask Gemini
Open upload file menu
Tools
Open mode picker
Pro
Microphone
Send message
Gemini is AI and can make mistakes, including about people.
Your privacy and Gemini Opens in a new window
Your privacy and Gemini
Opens in a new window...
|
38966
|
NULL
|
NULL
|
NULL
|
|
32024
|
1241
|
40
|
2026-05-13T09:03:47.689060+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-13/1778 /Users/lukas/.screenpipe/data/data/2026-05-13/1778663027689_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
New Tab
New Tab
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/
directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the
db.sqlite
database along with
precise timestamps and file path references
.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific
.mp4
file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw
.mp4
files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that
.mp4
file is your fallback source of truth to verify what was actually said.
How the Database Views Them
How the Database Views Them
If you were to open up your
db.sqlite
and query the tables (e.g.,
audio_transcriptions
), you would see that the database treats the
~/.screenpipe/data/
folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like
soundcore AeroClip (input)_2026-05-12_07-40-48.mp4
.
What happens if you delete them?
If you manually
rm
those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection:
ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving:
If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly
older
files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
Edit
You said yes I will do that. Is there a way to setup languages to transcription?
You said
yes I will do that. Is there a way to setup languages to transcription?
Listen
Show thinking
Gemini said
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription,
yes, you can configure it
, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
How the Transcription Engine Handles Language
By default, ScreenPipe uses a
multilingual Whisper model
and sets the language configuration to
Auto-Detect
.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage:
If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback:
Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the
Audio
or
AI/Transcription
section.
Look for the
Whisper Language
or
Transcription Language
dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in
~/.screenpipe/pipe.json
or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
(Slovak)
(Bulgarian)
(English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
Good response
Bad response
Redo
Share and export
Copy
Show more options
Ask Gemini
Ask Gemini
Open upload file menu
Tools
Open mode picker
Pro
Microphone
Send message
Gemini is AI and can make mistakes, including about people.
Your privacy and Gemini Opens in a new window
Your privacy and Gemini
Opens in a new window
Summarize page
Summarize page
Screenpipe [archive.db · 763MB]
Screenpipe
[archive.db · 763MB]
Activity
Search
Audio
Work Report
Timetable
AI Summary
Date
12
/
05
/
2026
Calendar
Monitor
Jump to
--
:
--
Go
APP TIMELINE · CLICK TO PLAY · DRAG SCROLLBAR TO PAN
−
1×
+
Follow
Follow
09:30
10:00
10:30
11:00
11:30
12:00
12:30
13:00
13:30
14:00
14:30
15:00
15:30
16:00
16:30
17:00
17:30
18:00
18:30
19:00
19:30
20:00
20:30
21:00
21:30
12 May 10:05 · Firefox / JY-20625 | JY-20742 | MCP POC by yalokin-jiminny · Pull Request #12036 · jiminny/app — Work
⏮ 30s
◀ 10s
⏸ Pause
10s ▶
30s ⏭
10:05
Firefox
iTerm2
Slack
CleanShot X
PhpStorm
Finder
QuickTime Player
Alfred
Raycast
Control Centre
Claude
coreautha
Code
Activity Monitor
Windsurf
Anybox...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"New Tab","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"New 
Tab","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.014960106,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.18994413,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.3463687,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.3575419,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.38068634,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). 
You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.05817819,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"bounds":{"left":0.17885639,"top":0.0,"width":0.0034906915,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.21010639,"height":0.037110932},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.011303191,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"bounds":{"left":0.120678194,"top":0.0,"width":0.008144947,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"bounds":{"left":0.14960106,"top":0.0,"width":0.020944148,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. 
Think of this as the raw archive.","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.21542554,"height":0.037110932},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"bounds":{"left":0.09142287,"top":0.029928172,"width":0.028922873,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"bounds":{"left":0.12034574,"top":0.029928172,"width":0.106715426,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"bounds":{"left":0.09142287,"top":0.029928172,"width":0.21991356,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"bounds":{"left":0.09142287,"top":0.050678372,"width":0.22174202,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"bounds":{"left":0.0787899,"top":0.13288109,"width":0.22456782,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"bounds":{"left":0.0787899,"top":0.18715084,"width":0.036402926,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"bounds":{"left":0.091755316,"top":0.19592977,"width":0.017785905,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"bounds":{"left":0.075465426,"top":0.23184358,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"bounds":{"left":0.08610372,"top":0.23184358,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and 
export","depth":23,"bounds":{"left":0.09674202,"top":0.23184358,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"bounds":{"left":0.107380316,"top":0.23184358,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"bounds":{"left":0.11801862,"top":0.23184358,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"bounds":{"left":0.14029256,"top":0.292498,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":21,"bounds":{"left":0.16023937,"top":0.30207503,"width":0.13696809,"height":0.09577015},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"bounds":{"left":0.068484046,"top":0.3028731,"width":0.019946808,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":23,"bounds":{"left":0.16023937,"top":0.30367118,"width":0.13663563,"height":0.13128492},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"bounds":{"left":0.29720744,"top":0.30207503,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"bounds":{"left":0.3025266,"top":0.42976856,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"bounds":{"left":0.09208777,"top":0.43216282,"width":0.030917553,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini 
said","depth":23,"bounds":{"left":0.08976064,"top":0.4736632,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"bounds":{"left":0.08976064,"top":0.47565842,"width":0.04105718,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"bounds":{"left":0.0787899,"top":0.47685555,"width":0.025930852,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"bounds":{"left":0.106715426,"top":0.47805268,"width":0.011136968,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"bounds":{"left":0.0787899,"top":0.47685555,"width":0.23005319,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"bounds":{"left":0.0787899,"top":0.5271349,"width":0.1341423,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"bounds":{"left":0.21492687,"top":0.528332,"width":0.053025264,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the 
system.","depth":27,"bounds":{"left":0.0787899,"top":0.5271349,"width":0.22623006,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"bounds":{"left":0.0787899,"top":0.5885874,"width":0.234375,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"bounds":{"left":0.0787899,"top":0.59018356,"width":0.08194814,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"bounds":{"left":0.0787899,"top":0.61652035,"width":0.20761304,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"bounds":{"left":0.24883644,"top":0.63846767,"width":0.025099734,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"bounds":{"left":0.0787899,"top":0.63727057,"width":0.22240691,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"bounds":{"left":0.105053194,"top":0.65802073,"width":0.11419548,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"bounds":{"left":0.21924867,"top":0.65802073,"width":0.0013297872,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search 
your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"bounds":{"left":0.0787899,"top":0.6875499,"width":0.22755983,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"bounds":{"left":0.27293882,"top":0.7094972,"width":0.011136968,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"bounds":{"left":0.0787899,"top":0.70830005,"width":0.23354389,"height":0.07861133},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"bounds":{"left":0.0787899,"top":0.811253,"width":0.234375,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"bounds":{"left":0.0787899,"top":0.81284916,"width":0.09840426,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. 
If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"bounds":{"left":0.0787899,"top":0.83918595,"width":0.23254654,"height":0.07861133},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"bounds":{"left":0.117519945,"top":0.90263367,"width":0.011136968,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"bounds":{"left":0.0787899,"top":0.90143657,"width":0.23038563,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"bounds":{"left":0.0787899,"top":0.98363924,"width":0.234375,"height":0.01636076},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"bounds":{"left":0.0787899,"top":0.98523545,"width":0.10405585,"height":0.014764547},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"bounds":{"left":0.0787899,"top":1.0,"width":0.2209109,"height":-0.011572242},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"bounds":{"left":0.14660904,"top":1.0,"width":0.011303191,"height":-0.05426979},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"bounds":{"left":0.0787899,"top":1.0,"width":0.23121676,"height":-0.05307257},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob 
storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. 
This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes I will do that. 
Is there a way to setup languages to transcription?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As for setting up languages for transcription,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes, you can configure it","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", but it helps to understand how ScreenPipe handles language natively first.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Transcription Engine Handles Language","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Transcription Engine Handles Language","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By 
default, ScreenPipe uses a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"multilingual Whisper model","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and sets the language configuration to","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-Detect","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Advantage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Drawback:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. 
When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How to Force a Specific Language","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How to Force a Specific Language","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are using the Desktop App UI:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Open the ScreenPipe settings.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Navigate to the","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AI/Transcription","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"section.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Look for 
the","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper Language","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Transcription Language","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"dropdown.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Change it from \"Auto\" to your specific language (e.g., English, Bulgarian, or Slovak).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are running ScreenPipe via CLI/Config:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You can modify your underlying configuration (usually found in","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/pipe.json","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or by passing a flag when starting the daemon) to force a specific language code. 
You would use the standard ISO language codes:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Slovak)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Bulgarian)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(English)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more 
options","depth":22,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXTextArea","text":"Ask Gemini","depth":20,"bounds":{"left":0.08211436,"top":0.83439744,"width":0.22573139,"height":0.01915403},"on_screen":true,"value":"Ask Gemini","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Ask Gemini","depth":21,"bounds":{"left":0.08211436,"top":0.8347965,"width":0.030086435,"height":0.018355945},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file menu","depth":20,"bounds":{"left":0.078125,"top":0.87031126,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Tools","depth":18,"bounds":{"left":0.094082445,"top":0.87031126,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open mode picker","depth":20,"bounds":{"left":0.27044547,"top":0.867917,"width":0.026097074,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Pro","depth":23,"bounds":{"left":0.2757646,"top":0.87669593,"width":0.007480053,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Microphone","depth":19,"bounds":{"left":0.29853722,"top":0.867917,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"toggle 
button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Send message","depth":19,"bounds":{"left":0.30485374,"top":0.8671189,"width":0.013962766,"height":0.033519555},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":false,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini is AI and can make mistakes, including about people.","depth":17,"bounds":{"left":0.11702128,"top":0.92178774,"width":0.11170213,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Your privacy and Gemini Opens in a new window","depth":17,"bounds":{"left":0.2287234,"top":0.92178774,"width":0.044215426,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Your privacy and Gemini","depth":18,"bounds":{"left":0.2287234,"top":0.92178774,"width":0.044215426,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Opens in a new window","depth":19,"bounds":{"left":0.068484046,"top":0.92098963,"width":0.043218084,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Summarize page","depth":7,"bounds":{"left":0.07413564,"top":0.95730245,"width":0.053523935,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Summarize page","depth":9,"bounds":{"left":0.07978723,"top":0.96249,"width":0.042220745,"height":0.015163607},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Screenpipe [archive.db · 
763MB]","depth":7,"bounds":{"left":0.33061835,"top":0.061452515,"width":0.06067154,"height":0.017956903},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Screenpipe","depth":8,"bounds":{"left":0.33061835,"top":0.06304868,"width":0.027759308,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"[archive.db · 763MB]","depth":9,"bounds":{"left":0.35970744,"top":0.06703911,"width":0.03158245,"height":0.009976057},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Activity","depth":7,"bounds":{"left":0.39594415,"top":0.059856344,"width":0.024933511,"height":0.0207502},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Search","depth":7,"bounds":{"left":0.42154256,"top":0.059856344,"width":0.023603724,"height":0.0207502},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Audio","depth":7,"bounds":{"left":0.44581118,"top":0.059856344,"width":0.020944148,"height":0.0207502},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Work 
Report","depth":7,"bounds":{"left":0.46742022,"top":0.059856344,"width":0.03507314,"height":0.0207502},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Timetable","depth":7,"bounds":{"left":0.5031583,"top":0.059856344,"width":0.029753989,"height":0.0207502},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Summary","depth":7,"bounds":{"left":0.53357714,"top":0.059856344,"width":0.034075797,"height":0.0207502},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Date","depth":8,"bounds":{"left":0.93866354,"top":0.0650439,"width":0.008144947,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"12","depth":9,"bounds":{"left":0.95545214,"top":0.06464485,"width":0.0048204786,"height":0.011572227},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"/","depth":8,"bounds":{"left":0.96127,"top":0.06464485,"width":0.0023271276,"height":0.011572227},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"05","depth":9,"bounds":{"left":0.9645944,"top":0.06464485,"width":0.0048204786,"height":0.011572227},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"/","depth":8,"bounds":{"left":0.97041225,"top":0.06464485,"width":0.002493351,"height":0.011572227},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2026","depth":9,"bounds":{"left":0.97390294,"top":0.06464485,"width":0.009474734,"height":0.011572227},"on_screen":
true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Calendar","depth":8,"bounds":{"left":0.9847075,"top":0.0650439,"width":0.0051529254,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Monitor","depth":9,"bounds":{"left":0.45262632,"top":0.10853951,"width":0.013464096,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Jump to","depth":9,"bounds":{"left":0.8111702,"top":0.10853951,"width":0.01412899,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"--","depth":10,"bounds":{"left":0.8312833,"top":0.10814046,"width":0.0048204786,"height":0.011572227},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":":","depth":9,"bounds":{"left":0.83710104,"top":0.10814046,"width":0.0023271276,"height":0.011572227},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"--","depth":10,"bounds":{"left":0.84042555,"top":0.10814046,"width":0.0048204786,"height":0.011572227},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Go","depth":8,"bounds":{"left":0.85920876,"top":0.10454908,"width":0.012300532,"height":0.018754989},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"APP TIMELINE · CLICK TO PLAY · DRAG SCROLLBAR TO 
PAN","depth":10,"bounds":{"left":0.45761302,"top":0.14964086,"width":0.10571808,"height":0.009976057},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"−","depth":9,"bounds":{"left":0.813996,"top":0.1452514,"width":0.009807181,"height":0.018754989},"on_screen":true,"help_text":"Zoom out","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"1×","depth":10,"bounds":{"left":0.82795876,"top":0.14924182,"width":0.004155585,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"+","depth":9,"bounds":{"left":0.83643615,"top":0.1452514,"width":0.009640957,"height":0.018754989},"on_screen":true,"help_text":"Zoom in","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Follow","depth":10,"bounds":{"left":0.8494016,"top":0.14924182,"width":0.004654255,"height":0.011173184},"on_screen":true,"help_text":"","role_description":"checkbox","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Follow","depth":10,"bounds":{"left":0.85538566,"top":0.14924182,"width":0.011136968,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"09:30","depth":13,"bounds":{"left":0.45894283,"top":0.21947326,"width":0.00880984,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"10:00","depth":13,"bounds":{"left":0.47573137,"top":0.21947326,"width":0.008144947,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"10:30","depth":13,"bounds":{"left":0.4921875,"top":0.21947326,"width":0.008144947,"height":0.0087789
31},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"11:00","depth":13,"bounds":{"left":0.50880986,"top":0.21947326,"width":0.0078125,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"11:30","depth":13,"bounds":{"left":0.52526593,"top":0.21947326,"width":0.0078125,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"12:00","depth":13,"bounds":{"left":0.5415558,"top":0.21947326,"width":0.008144947,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"12:30","depth":13,"bounds":{"left":0.55801195,"top":0.21947326,"width":0.008144947,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"13:00","depth":13,"bounds":{"left":0.57430184,"top":0.21947326,"width":0.00831117,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"13:30","depth":13,"bounds":{"left":0.59075797,"top":0.21947326,"width":0.00831117,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"14:00","depth":13,"bounds":{"left":0.6072141,"top":0.21947326,"width":0.00831117,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"14:30","depth":13,"bounds":{"left":0.6236702,"top":0.21947326,"width":0.00831117,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"15:00","depth":13,"bounds":{"left":0.64012635,"top":0.21947326,"width":0.00831117,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"tex
t","subrole":"AXUnknown"},{"role":"AXStaticText","text":"15:30","depth":13,"bounds":{"left":0.6565825,"top":0.21947326,"width":0.008144947,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"16:00","depth":13,"bounds":{"left":0.67303854,"top":0.21947326,"width":0.008144947,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"16:30","depth":13,"bounds":{"left":0.68949467,"top":0.21947326,"width":0.008144947,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"17:00","depth":13,"bounds":{"left":0.70611703,"top":0.21947326,"width":0.007978723,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"17:30","depth":13,"bounds":{"left":0.72257316,"top":0.21947326,"width":0.007978723,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"18:00","depth":13,"bounds":{"left":0.73886305,"top":0.21947326,"width":0.008144947,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"18:30","depth":13,"bounds":{"left":0.7553192,"top":0.21947326,"width":0.008144947,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"19:00","depth":13,"bounds":{"left":0.77177525,"top":0.21947326,"width":0.008144947,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"19:30","depth":13,"bounds":{"left":0.7882314,"top":0.21947326,"width":0.008144947,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","
text":"20:00","depth":13,"bounds":{"left":0.804355,"top":0.21947326,"width":0.008643617,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"20:30","depth":13,"bounds":{"left":0.82081115,"top":0.21947326,"width":0.008643617,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"21:00","depth":13,"bounds":{"left":0.83759975,"top":0.21947326,"width":0.008144947,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"21:30","depth":13,"bounds":{"left":0.8540558,"top":0.21947326,"width":0.008144947,"height":0.008778931},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"12 May 10:05 · Firefox / JY-20625 | JY-20742 | MCP POC by yalokin-jiminny · Pull Request #12036 · jiminny/app — Work","depth":10,"bounds":{"left":0.45628324,"top":0.2661612,"width":0.2237367,"height":0.011971269},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"⏮ 30s","depth":9,"bounds":{"left":0.45694813,"top":0.7186752,"width":0.023936171,"height":0.02434158},"on_screen":true,"help_text":"Ctrl+←","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"◀ 10s","depth":9,"bounds":{"left":0.48354387,"top":0.71907425,"width":0.02244016,"height":0.023942538},"on_screen":true,"help_text":"←","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"⏸ 
Pause","depth":9,"bounds":{"left":0.5086436,"top":0.7186752,"width":0.027925532,"height":0.02434158},"on_screen":true,"help_text":"Space","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"10s ▶","depth":9,"bounds":{"left":0.53922874,"top":0.71907425,"width":0.022273935,"height":0.023942538},"on_screen":true,"help_text":"→","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"30s ⏭","depth":9,"bounds":{"left":0.56416225,"top":0.7186752,"width":0.024102394,"height":0.02434158},"on_screen":true,"help_text":"Ctrl+→","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":true,"is_selected":false},{"role":"AXStaticText","text":"10:05","depth":10,"bounds":{"left":0.85738033,"top":0.7254589,"width":0.009807181,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Firefox","depth":9,"bounds":{"left":0.45761302,"top":0.7609737,"width":0.011801862,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"iTerm2","depth":9,"bounds":{"left":0.47772607,"top":0.7609737,"width":0.011801862,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Slack","depth":9,"bounds":{"left":0.4978391,"top":0.7609737,"width":0.009474734,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"CleanShot 
X","depth":9,"bounds":{"left":0.515625,"top":0.7609737,"width":0.02144282,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"PhpStorm","depth":9,"bounds":{"left":0.545379,"top":0.7609737,"width":0.017287234,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finder","depth":9,"bounds":{"left":0.5709774,"top":0.7609737,"width":0.010970744,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"QuickTime Player","depth":9,"bounds":{"left":0.5902593,"top":0.7609737,"width":0.03025266,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Alfred","depth":9,"bounds":{"left":0.62882316,"top":0.7609737,"width":0.010472074,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Raycast","depth":9,"bounds":{"left":0.6476064,"top":0.7609737,"width":0.013630319,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Control 
Centre","depth":9,"bounds":{"left":0.66954786,"top":0.7609737,"width":0.025598405,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Claude","depth":9,"bounds":{"left":0.7034575,"top":0.7609737,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"coreautha","depth":9,"bounds":{"left":0.72390294,"top":0.7609737,"width":0.017453458,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Code","depth":9,"bounds":{"left":0.7496675,"top":0.7609737,"width":0.00930851,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Activity Monitor","depth":9,"bounds":{"left":0.76728725,"top":0.7609737,"width":0.027426861,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Windsurf","depth":9,"bounds":{"left":0.80302525,"top":0.7609737,"width":0.015791224,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Anybox","depth":9,"bounds":{"left":0.82712764,"top":0.7609737,"width":0.012965426,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-586889167018772289
|
9209072041767611357
|
visual_change
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
New Tab
New Tab
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/
directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the
db.sqlite
database along with
precise timestamps and file path references
.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific
.mp4
file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw
.mp4
files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that
.mp4
file is your fallback source of truth to verify what was actually said.
How the Database Views Them
How the Database Views Them
If you were to open up your
db.sqlite
and query the tables (e.g.,
audio_transcriptions
), you would see that the database treats the
~/.screenpipe/data/
folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like
soundcore AeroClip (input)_2026-05-12_07-40-48.mp4
.
What happens if you delete them?
If you manually
rm
those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection:
ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving:
If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly
older
files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
Edit
You said yes I will do that. Is there a way to setup languages to transcription?
You said
yes I will do that. Is there a way to setup languages to transcription?
Listen
Show thinking
Gemini said
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription,
yes, you can configure it
, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
How the Transcription Engine Handles Language
By default, ScreenPipe uses a
multilingual Whisper model
and sets the language configuration to
Auto-Detect
.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage:
If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback:
Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the
Audio
or
AI/Transcription
section.
Look for the
Whisper Language
or
Transcription Language
dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in
~/.screenpipe/pipe.json
or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
(Slovak)
(Bulgarian)
(English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
Good response
Bad response
Redo
Share and export
Copy
Show more options
Ask Gemini
Ask Gemini
Open upload file menu
Tools
Open mode picker
Pro
Microphone
Send message
Gemini is AI and can make mistakes, including about people.
Your privacy and Gemini Opens in a new window
Your privacy and Gemini
Opens in a new window
Summarize page
Summarize page
Screenpipe [archive.db · 763MB]
Screenpipe
[archive.db · 763MB]
Activity
Search
Audio
Work Report
Timetable
AI Summary
Date
12
/
05
/
2026
Calendar
Monitor
Jump to
--
:
--
Go
APP TIMELINE · CLICK TO PLAY · DRAG SCROLLBAR TO PAN
−
1×
+
Follow
Follow
09:30
10:00
10:30
11:00
11:30
12:00
12:30
13:00
13:30
14:00
14:30
15:00
15:30
16:00
16:30
17:00
17:30
18:00
18:30
19:00
19:30
20:00
20:30
21:00
21:30
12 May 10:05 · Firefox / JY-20625 | JY-20742 | MCP POC by yalokin-jiminny · Pull Request #12036 · jiminny/app — Work
⏮ 30s
◀ 10s
⏸ Pause
10s ▶
30s ⏭
10:05
Firefox
iTerm2
Slack
CleanShot X
PhpStorm
Finder
QuickTime Player
Alfred
Raycast
Control Centre
Claude
coreautha
Code
Activity Monitor
Windsurf
Anybox...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
38707
|
1439
|
21
|
2026-05-14T06:26:52.678380+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740012678_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/
directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the
db.sqlite
database along with
precise timestamps and file path references
.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific
.mp4
file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw
.mp4
files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that
.mp4
file is your fallback source of truth to verify what was actually said.
How the Database Views Them
How the Database Views Them
If you were to open up your
db.sqlite
and query the tables (e.g.,
audio_transcriptions
), you would see that the database treats the
~/.screenpipe/data/
folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like
soundcore AeroClip (input)_2026-05-12_07-40-48.mp4
.
What happens if you delete them?
If you manually
rm
those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection:
ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving:
If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly
older
files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
Edit
You said yes I will do that. Is there a way to setup languages to transcription?
You said
yes I will do that. Is there a way to setup languages to transcription?
Listen
Show thinking
Gemini said
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription,
yes, you can configure it
, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
How the Transcription Engine Handles Language
By default, ScreenPipe uses a
multilingual Whisper model
and sets the language configuration to
Auto-Detect
.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage:
If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback:
Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the
Audio
or
AI/Transcription
section.
Look for the
Whisper Language
or
Transcription Language
dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in
~/.screenpipe/pipe.json
or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
(Slovak)
(Bulgarian)
(English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
Good response
Bad response
Redo
Share and export
Copy
Show more options
Ask Gemini
Ask Gemini
Open upload file menu
Tools
Open mode picker
Pro
Microphone
Send message
Gemini is AI and can make mistakes, including about people.
Your privacy and Gemini Opens in a new window
Your privacy and Gemini...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — 
Archive","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.02642952,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. 
You could script a simple cron job to regularly","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.22041224,"height":0.057861134},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"bounds":{"left":0.18035239,"top":0.0,"width":0.015292553,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.22174202,"height":0.057861134},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"bounds":{"left":0.075465426,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"bounds":{"left":0.08610372,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"bounds":{"left":0.09674202,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"bounds":{"left":0.107380316,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more 
options","depth":23,"bounds":{"left":0.11801862,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"bounds":{"left":0.12566489,"top":0.0905826,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"bounds":{"left":0.14029256,"top":0.0905826,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"bounds":{"left":0.16023937,"top":0.100159615,"width":0.15026596,"height":0.03830806},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"bounds":{"left":0.068484046,"top":0.1009577,"width":0.019946808,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes I will do that. 
Is there a way to setup languages to transcription?","depth":23,"bounds":{"left":0.16023937,"top":0.10175578,"width":0.12849069,"height":0.035514764},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"bounds":{"left":0.3025266,"top":0.17039107,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"bounds":{"left":0.09208777,"top":0.17278531,"width":0.030917553,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"bounds":{"left":0.08976064,"top":0.21428572,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"bounds":{"left":0.08976064,"top":0.21628092,"width":0.04105718,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"That is a solid plan. 
Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.","depth":27,"bounds":{"left":0.0787899,"top":0.21747805,"width":0.23088431,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As for setting up languages for transcription,","depth":27,"bounds":{"left":0.0787899,"top":0.28850758,"width":0.10920878,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes, you can configure it","depth":27,"bounds":{"left":0.18799867,"top":0.28850758,"width":0.06333112,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", but it helps to understand how ScreenPipe handles language natively first.","depth":27,"bounds":{"left":0.0787899,"top":0.28850758,"width":0.20994017,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Transcription Engine Handles Language","depth":26,"bounds":{"left":0.0787899,"top":0.3499601,"width":0.234375,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Transcription Engine Handles Language","depth":27,"bounds":{"left":0.0787899,"top":0.35155627,"width":0.12549867,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses a","depth":27,"bounds":{"left":0.0787899,"top":0.37789306,"width":0.072972074,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"multilingual Whisper 
model","depth":27,"bounds":{"left":0.15176196,"top":0.37789306,"width":0.07047872,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and sets the language configuration to","depth":27,"bounds":{"left":0.0787899,"top":0.37789306,"width":0.23321144,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-Detect","depth":27,"bounds":{"left":0.08510638,"top":0.39864326,"width":0.032247342,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"bounds":{"left":0.11735372,"top":0.39864326,"width":0.0013297872,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.","depth":27,"bounds":{"left":0.0787899,"top":0.42817238,"width":0.23038563,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Advantage:","depth":29,"bounds":{"left":0.09142287,"top":0.47845173,"width":0.040724736,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.","depth":29,"bounds":{"left":0.09142287,"top":0.47845173,"width":0.22174202,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The 
Drawback:","depth":29,"bounds":{"left":0.09142287,"top":0.5494813,"width":0.038896278,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.","depth":29,"bounds":{"left":0.09142287,"top":0.5494813,"width":0.20744681,"height":0.09936153},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How to Force a Specific Language","depth":26,"bounds":{"left":0.0787899,"top":0.67318434,"width":0.234375,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How to Force a Specific Language","depth":27,"bounds":{"left":0.0787899,"top":0.67478055,"width":0.08759973,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. 
This also slightly speeds up the processing because the model skips the detection phase.","depth":27,"bounds":{"left":0.0787899,"top":0.70111734,"width":0.2278923,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are using the Desktop App UI:","depth":27,"bounds":{"left":0.0787899,"top":0.7721468,"width":0.09275266,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Open the ScreenPipe settings.","depth":29,"bounds":{"left":0.09142287,"top":0.801676,"width":0.07347074,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Navigate to the","depth":29,"bounds":{"left":0.09142287,"top":0.8312051,"width":0.038231384,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio","depth":29,"bounds":{"left":0.12965426,"top":0.8312051,"width":0.014960106,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"bounds":{"left":0.14461437,"top":0.8312051,"width":0.0078125,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AI/Transcription","depth":29,"bounds":{"left":0.15242687,"top":0.8312051,"width":0.041888297,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"section.","depth":29,"bounds":{"left":0.19431517,"top":0.8312051,"width":0.02044548,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Look for 
the","depth":29,"bounds":{"left":0.09142287,"top":0.8607342,"width":0.030585106,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper Language","depth":29,"bounds":{"left":0.12200798,"top":0.8607342,"width":0.04837101,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"bounds":{"left":0.17037898,"top":0.8607342,"width":0.0078125,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Transcription Language","depth":29,"bounds":{"left":0.17819148,"top":0.8607342,"width":0.061502658,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"dropdown.","depth":29,"bounds":{"left":0.23969415,"top":0.8607342,"width":0.027260639,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Change it from \"Auto\" to your specific language (e.g., English, Bulgarian, or Slovak).","depth":29,"bounds":{"left":0.09142287,"top":0.8902634,"width":0.20079787,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are running ScreenPipe via CLI/Config:","depth":27,"bounds":{"left":0.0787899,"top":0.92378294,"width":0.116023935,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You can modify your underlying configuration (usually found 
in","depth":27,"bounds":{"left":0.0787899,"top":0.9445331,"width":0.15159574,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/pipe.json","depth":28,"bounds":{"left":0.23238032,"top":0.94573027,"width":0.064328454,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:","depth":27,"bounds":{"left":0.0787899,"top":0.9445331,"width":0.234375,"height":0.05546689},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Slovak)","depth":29,"bounds":{"left":0.14012633,"top":1.0,"width":0.020777926,"height":-0.015562654},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Bulgarian)","depth":29,"bounds":{"left":0.14012633,"top":1.0,"width":0.027925532,"height":-0.04509175},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(English)","depth":29,"bounds":{"left":0.14012633,"top":1.0,"width":0.02244016,"height":-0.07462096},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle 
button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":22,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXTextArea","text":"Ask Gemini","depth":20,"bounds":{"left":0.08211436,"top":0.83439744,"width":0.22573139,"height":0.01915403},"on_screen":true,"value":"Ask Gemini","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Ask Gemini","depth":21,"bounds":{"left":0.08211436,"top":0.8347965,"width":0.030086435,"height":0.018355945},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file 
menu","depth":20,"bounds":{"left":0.078125,"top":0.87031126,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Tools","depth":18,"bounds":{"left":0.094082445,"top":0.87031126,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open mode picker","depth":20,"bounds":{"left":0.27044547,"top":0.867917,"width":0.026097074,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Pro","depth":23,"bounds":{"left":0.2757646,"top":0.87669593,"width":0.007480053,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Microphone","depth":19,"bounds":{"left":0.29853722,"top":0.867917,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Send message","depth":19,"bounds":{"left":0.30485374,"top":0.8671189,"width":0.013962766,"height":0.033519555},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":false,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini is AI and can make mistakes, including about people.","depth":17,"bounds":{"left":0.11702128,"top":0.92178774,"width":0.11170213,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Your privacy and Gemini Opens in a new 
window","depth":17,"bounds":{"left":0.2287234,"top":0.92178774,"width":0.044215426,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Your privacy and Gemini","depth":18,"bounds":{"left":0.2287234,"top":0.92178774,"width":0.044215426,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-868383271587244932
|
9209071904337046493
|
click
|
accessibility
|
NULL
|
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization," a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
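To make the Storage step concrete, here is a minimal sketch of a phrase search against an FTS5 index, using Python's built-in sqlite3 module. The table name `transcripts` and its columns are assumptions made up for this illustration, not ScreenPipe's actual schema, and the sketch assumes your SQLite build includes the FTS5 extension.

```python
import sqlite3


def build_demo_index(rows):
    """Create an in-memory FTS5 table and load (timestamp, text) rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema for illustration only; not ScreenPipe's real tables.
    conn.execute(
        "CREATE VIRTUAL TABLE transcripts USING fts5(timestamp UNINDEXED, text)"
    )
    conn.executemany(
        "INSERT INTO transcripts (timestamp, text) VALUES (?, ?)", rows
    )
    return conn


def search(conn, phrase):
    """Return (timestamp, text) rows containing the phrase, best match first."""
    # Wrapping the input in double quotes makes FTS5 treat it as one phrase.
    query = '"' + phrase.replace('"', '""') + '"'
    return conn.execute(
        "SELECT timestamp, text FROM transcripts "
        "WHERE transcripts MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()


conn = build_demo_index(
    [
        ("2026-05-11T06:17:14", "let us start with the deployment checklist"),
        ("2026-05-12T07:40:48", "we should revisit the quarterly budget next week"),
    ]
)
hits = search(conn, "quarterly budget")
```

The point of the full-text index is that the lookup stays fast as the transcript table grows, because the query consults the index rather than scanning every row.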
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
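The three stages above can be pictured as a simple queue that is drained into a database. The sketch below is only an illustration of that idea: `fake_transcribe` is a hypothetical stand-in for the speech-to-text call, and the `transcripts` table is an assumed schema, not how ScreenPipe is actually implemented.

```python
import queue
import sqlite3


def fake_transcribe(chunk):
    """Hypothetical stand-in for a speech-to-text call; no real model runs."""
    return f"transcript of {len(chunk)} bytes"


def drain(pending, conn):
    """Transcribe queued chunks in order and commit each result."""
    done = 0
    while True:
        try:
            started_at, chunk = pending.get_nowait()
        except queue.Empty:
            return done
        conn.execute(
            "INSERT INTO transcripts (started_at, text) VALUES (?, ?)",
            (started_at, fake_transcribe(chunk)),
        )
        conn.commit()  # a chunk counts as "done" once its text is committed
        done += 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transcripts (started_at TEXT, text TEXT)")

pending = queue.Queue()  # the WIP stage: chunks recorded but not yet transcribed
pending.put(("2026-05-12T06:49:17", b"\x00" * 1600))
pending.put(("2026-05-12T06:49:47", b"\x00" * 3200))

backlog_before = pending.qsize()
processed = drain(pending, conn)
```

In this picture, the queue length is the backlog: anything still in `pending` has been heard but is not yet searchable.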
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database ([…]): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The […] or […] folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary […] chunks, or locked database journals (like […]), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
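If you want to know which transcripts have lost their audio before playback fails, one option is to compare the file paths stored in the database against what is actually on disk. This is a rough sketch under stated assumptions: the `media_refs` table and its `file_path` column are hypothetical names invented for the demo, so you would substitute a query that matches your real db.sqlite schema.

```python
import os
import sqlite3
import tempfile


def find_missing_media(conn, query):
    """Return referenced file paths (first column of query) missing on disk."""
    paths = {row[0] for row in conn.execute(query)}
    return sorted(p for p in paths if not os.path.exists(p))


# Demo on throwaway data: one referenced file exists, one was deleted.
with tempfile.TemporaryDirectory() as data_dir:
    kept_path = os.path.join(data_dir, "System Audio (output)_demo.mp4")
    gone_path = os.path.join(data_dir, "MacBook Pro Microphone (input)_demo.mp4")
    with open(kept_path, "wb") as handle:
        handle.write(b"placeholder")

    conn = sqlite3.connect(":memory:")
    # Hypothetical table standing in for whatever holds the file references.
    conn.execute("CREATE TABLE media_refs (file_path TEXT)")
    conn.executemany(
        "INSERT INTO media_refs (file_path) VALUES (?)",
        [(kept_path,), (gone_path,)],
    )
    missing = find_missing_media(conn, "SELECT file_path FROM media_refs")
    expected_missing = [gone_path]
```

Because the check only reads the database and the file system, it is safe to run while recording continues.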
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly […] older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
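As a starting point for that archiving job, here is a minimal sketch in Python that moves old .mp4 files to another directory and leaves a symlink behind under the original name. The function name, the 30-day default, and the demo directories are assumptions for illustration; it has not been tested against a live ScreenPipe install, so try it on a copy of your data first.

```python
import os
import shutil
import tempfile
import time
from pathlib import Path


def archive_old_media(local_dir, archive_dir, max_age_days=30, now=None):
    """Move .mp4 files older than max_age_days and symlink them back."""
    local_dir, archive_dir = Path(local_dir), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    cutoff = (time.time() if now is None else now) - max_age_days * 86400
    moved = []
    for path in sorted(local_dir.glob("*.mp4")):
        # Skip files that are already symlinks and files that are still recent.
        if path.is_symlink() or path.stat().st_mtime >= cutoff:
            continue
        target = archive_dir / path.name
        if target.exists():
            continue  # never overwrite an existing archive copy
        shutil.move(str(path), str(target))
        # An absolute link target keeps the original path readable.
        path.symlink_to(target.resolve())
        moved.append(path.name)
    return moved


# Demo on throwaway directories: one file 40 days old, one fresh.
with tempfile.TemporaryDirectory() as root:
    local, nas = Path(root) / "data", Path(root) / "nas"
    local.mkdir()
    (local / "old.mp4").write_bytes(b"old audio")
    (local / "new.mp4").write_bytes(b"new audio")
    forty_days_ago = time.time() - 40 * 86400
    os.utime(local / "old.mp4", (forty_days_ago, forty_days_ago))

    moved = archive_old_media(local, nas)
    old_is_link = (local / "old.mp4").is_symlink()
    old_still_readable = (local / "old.mp4").read_bytes() == b"old audio"
    new_is_link = (local / "new.mp4").is_symlink()
```

The file name and path stay the same from the database's point of view, which is why the stored references keep working as long as the NAS is mounted.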
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
[…] (Slovak)
[…] (Bulgarian)
[…] (English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
|
38705
|
NULL
|
NULL
|
NULL
|
|
38999
|
1442
|
31
|
2026-05-14T06:32:29.646823+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740349646_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", the process of working out which speaker said which part of the audio. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
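To make the FTS5 point concrete, here is a minimal sketch of a phrase search using Python's built-in sqlite3 module. The table and column names (transcripts, transcription, device, recorded_at) and the sample rows are invented for illustration; ScreenPipe's actual schema may look quite different, and the sketch assumes the local SQLite build includes FTS5.

```python
import sqlite3

# Hypothetical schema for illustration only; ScreenPipe's real tables may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE transcripts USING fts5(transcription, device, recorded_at)"
)
conn.executemany(
    "INSERT INTO transcripts (transcription, device, recorded_at) VALUES (?, ?, ?)",
    [
        ("let us move the release to Friday", "System Audio (output)", "2026-05-11 06:17:14"),
        ("the budget review is postponed", "MacBook Pro Microphone (input)", "2026-05-12 12:17:23"),
    ],
)


def search(phrase: str) -> list[tuple[str, str]]:
    """Return (recorded_at, transcription) rows containing the exact phrase."""
    # Wrapping the input in double quotes makes FTS5 treat it as a phrase query.
    query = '"' + phrase.replace('"', '""') + '"'
    rows = conn.execute(
        "SELECT recorded_at, transcription FROM transcripts "
        "WHERE transcripts MATCH ? ORDER BY rank",
        (query,),
    )
    return rows.fetchall()


hits = search("budget review")
```

The index is what keeps this fast: FTS5 looks the words up in an inverted index instead of scanning every transcript row.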
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
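The buffering, queueing, and finalization steps can be pictured as a simple producer/consumer setup. This is a generic sketch of that pattern, not ScreenPipe's actual code: fake_transcribe stands in for the speech-to-text model, and the chunk sizes and 30-second offsets are made up.

```python
import queue
import threading


def fake_transcribe(chunk: bytes) -> str:
    """Stand-in for the speech-to-text model; a real engine would run Whisper here."""
    return f"<{len(chunk)} bytes transcribed>"


def worker(pending: queue.Queue, done: list) -> None:
    """Pull chunks off the queue in arrival order until a None sentinel arrives."""
    while True:
        item = pending.get()
        if item is None:
            break
        started_at, chunk = item
        # "Finalization": in a real system this is where the text would be
        # written to the database together with its timestamp.
        done.append((started_at, fake_transcribe(chunk)))


pending: queue.Queue = queue.Queue()
done: list = []
consumer = threading.Thread(target=worker, args=(pending, done))
consumer.start()

# Capture side: chunks are queued as they are recorded and wait for the worker.
for i in range(3):
    pending.put((i * 30, bytes(1000 * (i + 1))))
pending.put(None)  # signal that capture has stopped
consumer.join()
```

Anything still sitting in pending corresponds to the WIP stage: recorded, but not yet searchable.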
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see files in ~/.screenpipe/data/ named like LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4, MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4, soundcore AeroClip (input)_2026-05-12_07-40-48.mp4, or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in the SQLite DB?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the original .mp4 files allows you to re-process historical audio. Without those files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
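Filenames like the one quoted above carry the device name, the stream direction, and the start time, so they can be decoded without touching the database. The sketch below infers the pattern purely from the example filenames in this conversation; MediaFile, NAME_RE, and parse_media_name are names made up here, and other naming schemes may exist.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class MediaFile(NamedTuple):
    device: str
    direction: str  # "input" or "output"
    started_at: datetime


# Pattern inferred from the example filenames only; not an official format.
NAME_RE = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)_"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)


def parse_media_name(name: str) -> Optional[MediaFile]:
    """Split a media filename into device, direction, and start time."""
    match = NAME_RE.match(name)
    if match is None:
        return None  # name does not follow the assumed pattern
    started = datetime.strptime(match.group("stamp"), "%Y-%m-%d_%H-%M-%S")
    return MediaFile(match.group("device"), match.group("direction"), started)


parsed = parse_media_name("soundcore AeroClip (input)_2026-05-12_07-40-48.mp4")
```

Grouping the parsed results by device and date gives a quick overview of what was recorded when.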
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
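One way to see which transcripts have lost their audio is to compare the filenames stored in the database against what is actually on disk. This is only a sketch: the file_name column is an assumption (the real column holding the reference may be named differently or live in another table), and the demo runs against a throwaway in-memory database and a temporary directory, not a real ScreenPipe install.

```python
import sqlite3
import tempfile
from pathlib import Path


def find_missing_media(conn: sqlite3.Connection, data_dir: Path) -> list[str]:
    """Return referenced media names that no longer exist under data_dir."""
    # `file_name` is a hypothetical column chosen for this sketch.
    rows = conn.execute(
        "SELECT DISTINCT file_name FROM audio_transcriptions ORDER BY file_name"
    )
    return [name for (name,) in rows if not (data_dir / name).exists()]


# Self-contained demo: one referenced file exists, one has been deleted.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    (data_dir / "kept.mp4").touch()
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE audio_transcriptions (id INTEGER PRIMARY KEY, file_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO audio_transcriptions (file_name) VALUES (?)",
        [("kept.mp4",), ("deleted.mp4",)],
    )
    missing = find_missing_media(conn, data_dir)
    conn.close()
```

Rows listed in missing would still be searchable as text, but playback for them would have nothing to open.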
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
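The archive-and-symlink idea could look roughly like this. It is a sketch under assumptions: archive_old_media and its parameters are invented here, the age check uses file modification time, and it has not been verified that ScreenPipe follows symlinks during playback, so it is worth trying on a single file first.

```python
import shutil
import time
from pathlib import Path


def archive_old_media(data_dir: Path, archive_dir: Path, max_age_days: float) -> list[Path]:
    """Move .mp4 files older than max_age_days into archive_dir, leaving symlinks behind."""
    cutoff = time.time() - max_age_days * 86400
    archive_dir = archive_dir.resolve()  # absolute target so the symlinks stay valid
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for src in sorted(data_dir.glob("*.mp4")):
        # Skip files already replaced by a symlink, and files that are still recent
        # (recent files may still be written to by the recorder).
        if src.is_symlink() or src.stat().st_mtime >= cutoff:
            continue
        dest = archive_dir / src.name
        shutil.move(str(src), str(dest))  # falls back to copy + delete across filesystems
        src.symlink_to(dest)              # the original path keeps resolving
        moved.append(dest)
    return moved
```

A cron entry could then call something like archive_old_media(Path.home() / ".screenpipe" / "data", Path("/Volumes/nas/screenpipe"), 30) once a day; the NAS mount path here is a placeholder.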
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
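The difference between auto-detect and a forced language can be sketched as a small decision function. Everything here is illustrative: pick_language, the Detector type, and the probe size (about five seconds of 16 kHz, 16-bit mono audio) are assumptions for the sketch, not ScreenPipe or Whisper internals.

```python
from typing import Callable, Dict, Optional

# Hypothetical stand-in for the model's language-ID step: audio in, per-language scores out.
Detector = Callable[[bytes], Dict[str, float]]


def pick_language(
    chunk: bytes,
    detect: Detector,
    forced: Optional[str] = None,
    probe_bytes: int = 16000 * 2 * 5,  # assumed: ~5 s of 16 kHz 16-bit mono audio
) -> str:
    """Return the language code a chunk should be transcribed with."""
    if forced is not None:
        return forced  # a fixed language skips detection entirely
    scores = detect(chunk[:probe_bytes])  # only the opening seconds are inspected
    return max(scores, key=scores.get)


calls = []


def fake_detect(audio: bytes) -> Dict[str, float]:
    """Toy detector that records how much audio it was given."""
    calls.append(len(audio))
    return {"en": 0.2, "sk": 0.7, "bg": 0.1}


auto = pick_language(bytes(500_000), fake_detect)
fixed = pick_language(bytes(500_000), fake_detect, forced="en")
```

Because the guess is made once from the start of the chunk, a conversation that switches language midway is transcribed under whichever language happened to open it.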
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
sk (Slovak)
bg (Bulgarian)
en (English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — 
Archive","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.02642952,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. 
You could script a simple cron job to regularly","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.22041224,"height":0.057861134},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"bounds":{"left":0.18035239,"top":0.0,"width":0.015292553,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.22174202,"height":0.057861134},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"bounds":{"left":0.075465426,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"bounds":{"left":0.08610372,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"bounds":{"left":0.09674202,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"bounds":{"left":0.107380316,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more 
options","depth":23,"bounds":{"left":0.11801862,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"bounds":{"left":0.12566489,"top":0.0905826,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"bounds":{"left":0.14029256,"top":0.0905826,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"bounds":{"left":0.16023937,"top":0.100159615,"width":0.15026596,"height":0.03830806},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"bounds":{"left":0.068484046,"top":0.1009577,"width":0.019946808,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes I will do that. 
Is there a way to setup languages to transcription?","depth":23,"bounds":{"left":0.16023937,"top":0.10175578,"width":0.12849069,"height":0.035514764},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"bounds":{"left":0.3025266,"top":0.17039107,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"bounds":{"left":0.09208777,"top":0.17278531,"width":0.030917553,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"bounds":{"left":0.08976064,"top":0.21428572,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"bounds":{"left":0.08976064,"top":0.21628092,"width":0.04105718,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"That is a solid plan. 
Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.","depth":27,"bounds":{"left":0.0787899,"top":0.21747805,"width":0.23088431,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As for setting up languages for transcription,","depth":27,"bounds":{"left":0.0787899,"top":0.28850758,"width":0.10920878,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes, you can configure it","depth":27,"bounds":{"left":0.18799867,"top":0.28850758,"width":0.06333112,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", but it helps to understand how ScreenPipe handles language natively first.","depth":27,"bounds":{"left":0.0787899,"top":0.28850758,"width":0.20994017,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Transcription Engine Handles Language","depth":26,"bounds":{"left":0.0787899,"top":0.3499601,"width":0.234375,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Transcription Engine Handles Language","depth":27,"bounds":{"left":0.0787899,"top":0.35155627,"width":0.12549867,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses a","depth":27,"bounds":{"left":0.0787899,"top":0.37789306,"width":0.072972074,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"multilingual Whisper 
model","depth":27,"bounds":{"left":0.15176196,"top":0.37789306,"width":0.07047872,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and sets the language configuration to","depth":27,"bounds":{"left":0.0787899,"top":0.37789306,"width":0.23321144,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-Detect","depth":27,"bounds":{"left":0.08510638,"top":0.39864326,"width":0.032247342,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"bounds":{"left":0.11735372,"top":0.39864326,"width":0.0013297872,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.","depth":27,"bounds":{"left":0.0787899,"top":0.42817238,"width":0.23038563,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Advantage:","depth":29,"bounds":{"left":0.09142287,"top":0.47845173,"width":0.040724736,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.","depth":29,"bounds":{"left":0.09142287,"top":0.47845173,"width":0.22174202,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The 
Drawback:","depth":29,"bounds":{"left":0.09142287,"top":0.5494813,"width":0.038896278,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.","depth":29,"bounds":{"left":0.09142287,"top":0.5494813,"width":0.20744681,"height":0.09936153},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How to Force a Specific Language","depth":26,"bounds":{"left":0.0787899,"top":0.67318434,"width":0.234375,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How to Force a Specific Language","depth":27,"bounds":{"left":0.0787899,"top":0.67478055,"width":0.08759973,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. 
This also slightly speeds up the processing because the model skips the detection phase.","depth":27,"bounds":{"left":0.0787899,"top":0.70111734,"width":0.2278923,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are using the Desktop App UI:","depth":27,"bounds":{"left":0.0787899,"top":0.7721468,"width":0.09275266,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Open the ScreenPipe settings.","depth":29,"bounds":{"left":0.09142287,"top":0.801676,"width":0.07347074,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Navigate to the","depth":29,"bounds":{"left":0.09142287,"top":0.8312051,"width":0.038231384,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio","depth":29,"bounds":{"left":0.12965426,"top":0.8312051,"width":0.014960106,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"bounds":{"left":0.14461437,"top":0.8312051,"width":0.0078125,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AI/Transcription","depth":29,"bounds":{"left":0.15242687,"top":0.8312051,"width":0.041888297,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"section.","depth":29,"bounds":{"left":0.19431517,"top":0.8312051,"width":0.02044548,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Look for 
the","depth":29,"bounds":{"left":0.09142287,"top":0.8607342,"width":0.030585106,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper Language","depth":29,"bounds":{"left":0.12200798,"top":0.8607342,"width":0.04837101,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"bounds":{"left":0.17037898,"top":0.8607342,"width":0.0078125,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Transcription Language","depth":29,"bounds":{"left":0.17819148,"top":0.8607342,"width":0.061502658,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"dropdown.","depth":29,"bounds":{"left":0.23969415,"top":0.8607342,"width":0.027260639,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Change it from \"Auto\" to your specific language (e.g., English, Bulgarian, or Slovak).","depth":29,"bounds":{"left":0.09142287,"top":0.8902634,"width":0.20079787,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are running ScreenPipe via CLI/Config:","depth":27,"bounds":{"left":0.0787899,"top":0.92378294,"width":0.116023935,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You can modify your underlying configuration (usually found 
in","depth":27,"bounds":{"left":0.0787899,"top":0.9445331,"width":0.15159574,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/pipe.json","depth":28,"bounds":{"left":0.23238032,"top":0.94573027,"width":0.064328454,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:","depth":27,"bounds":{"left":0.0787899,"top":0.9445331,"width":0.234375,"height":0.05546689},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Slovak)","depth":29,"bounds":{"left":0.14012633,"top":1.0,"width":0.020777926,"height":-0.015562654},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Bulgarian)","depth":29,"bounds":{"left":0.14012633,"top":1.0,"width":0.027925532,"height":-0.04509175},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(English)","depth":29,"bounds":{"left":0.14012633,"top":1.0,"width":0.02244016,"height":-0.07462096},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle 
button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":22,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXTextArea","text":"Ask Gemini","depth":20,"bounds":{"left":0.08211436,"top":0.83439744,"width":0.22573139,"height":0.01915403},"on_screen":true,"value":"Ask Gemini","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Ask Gemini","depth":21,"bounds":{"left":0.08211436,"top":0.8347965,"width":0.030086435,"height":0.018355945},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file 
menu","depth":20,"bounds":{"left":0.078125,"top":0.87031126,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Tools","depth":18,"bounds":{"left":0.094082445,"top":0.87031126,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open mode picker","depth":20,"bounds":{"left":0.27044547,"top":0.867917,"width":0.026097074,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Pro","depth":23,"bounds":{"left":0.2757646,"top":0.87669593,"width":0.007480053,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
8637438618246780860
|
9209071893608016861
|
visual_change
|
accessibility
|
NULL
|
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How does it get transferred and transcribed? Is there a work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
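To make the full-text search step concrete, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory FTS5 table. The table name `transcripts_fts`, its columns, and the sample rows are hypothetical and chosen only for illustration; they are not taken from ScreenPipe's actual schema. The sketch also assumes the local SQLite build was compiled with FTS5.

```python
import sqlite3


def build_demo_index() -> sqlite3.Connection:
    """Create an in-memory FTS5 index holding a few sample transcript rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per transcribed audio chunk.
    conn.execute("CREATE VIRTUAL TABLE transcripts_fts USING fts5(spoken_at, text)")
    conn.executemany(
        "INSERT INTO transcripts_fts (spoken_at, text) VALUES (?, ?)",
        [
            ("2026-05-11T06:17:14", "let us review the quarterly budget"),
            ("2026-05-12T07:40:48", "the backup job on the NAS finished"),
            ("2026-05-12T12:17:23", "remember to update the budget sheet"),
        ],
    )
    return conn


def search(conn: sqlite3.Connection, phrase: str) -> list[tuple[str, str]]:
    """Return (spoken_at, text) rows matching an FTS5 query, oldest first."""
    cur = conn.execute(
        "SELECT spoken_at, text FROM transcripts_fts "
        "WHERE transcripts_fts MATCH ? ORDER BY spoken_at",
        (phrase,),
    )
    return cur.fetchall()


conn = build_demo_index()
hits = search(conn, "budget")
for spoken_at, text in hits:
    print(spoken_at, text)
```

The MATCH operator consults the FTS index instead of scanning every row, which is why this kind of lookup stays fast as the transcript history grows.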
3. The "Work in Progress" (WIP) Stage
There is a short delay between hearing the audio and saving the text, and this gap acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
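The buffer, queue, and finalize flow described above can be sketched as a small producer/consumer pipeline. This is an illustrative model only, not ScreenPipe's implementation: `fake_transcribe` stands in for the speech-to-text model, and the `transcripts` table is a hypothetical schema.

```python
import queue
import sqlite3
import threading


def fake_transcribe(chunk: bytes) -> str:
    """Stand-in for a speech-to-text model; real transcription is far slower."""
    return f"<{len(chunk)} bytes of audio>"


def run_pipeline(chunks: list[bytes]) -> list[tuple[int, str]]:
    """Queue audio chunks, transcribe them on a worker thread, commit each result."""
    pending: queue.Queue = queue.Queue()
    # check_same_thread=False lets the worker use a connection created here;
    # the two threads never touch it at the same time.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE transcripts (chunk_id INTEGER, text TEXT)")

    def worker() -> None:
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more chunks will arrive
                break
            chunk_id, chunk = item
            conn.execute(
                "INSERT INTO transcripts VALUES (?, ?)",
                (chunk_id, fake_transcribe(chunk)),
            )
            conn.commit()  # the chunk counts as "done" once this succeeds

    thread = threading.Thread(target=worker)
    thread.start()
    for i, chunk in enumerate(chunks):
        pending.put((i, chunk))  # anything still in the queue is work in progress
    pending.put(None)
    thread.join()
    return conn.execute(
        "SELECT chunk_id, text FROM transcripts ORDER BY chunk_id"
    ).fetchall()


rows = run_pipeline([b"\x00" * 16, b"\x00" * 32])
print(rows)
```

In this model, a chunk is in the WIP stage from the moment it is put on the queue until its row is committed.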
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database: This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The media folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see files in ~/.screenpipe/data/ such as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4, MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4, soundcore AeroClip (input)_2026-05-12_07-40-48.mp4, and System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after the audio has been transcribed and stored in the SQLite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text. It uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model, or if a radically better speaker diarization algorithm is released, having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
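Because the references are "soft" (the database does not enforce that the file exists), a small check can list which referenced media files are missing from disk. The sketch below is self-contained and runs against a throwaway database. The table `demo_transcriptions` and its `file_name` column are hypothetical placeholders, so the real table and column names in db.sqlite would need to be looked up before adapting it.

```python
import sqlite3
import tempfile
from pathlib import Path


def missing_media(conn: sqlite3.Connection, data_dir: Path) -> list[str]:
    """Return filenames referenced in the database but absent from data_dir."""
    rows = conn.execute(
        "SELECT DISTINCT file_name FROM demo_transcriptions"
    ).fetchall()
    return sorted(name for (name,) in rows if not (data_dir / name).exists())


with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    kept = "System Audio (output)_2026-05-11_06-17-14.mp4"
    gone = "soundcore AeroClip (input)_2026-05-12_07-40-48.mp4"
    (data_dir / kept).write_bytes(b"")  # only this file exists on disk

    conn = sqlite3.connect(":memory:")
    # Hypothetical schema standing in for the real transcription table.
    conn.execute("CREATE TABLE demo_transcriptions (file_name TEXT, text TEXT)")
    conn.executemany(
        "INSERT INTO demo_transcriptions VALUES (?, ?)",
        [(kept, "hello"), (gone, "world")],
    )
    missing = missing_media(conn, data_dir)
    print(missing)
```

A non-empty result means those transcripts remain searchable as text, but their audio can no longer be played back.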
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
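The archiving idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a tested ScreenPipe workflow: the function name, the 30-day threshold, and the directory names are all hypothetical, and the demo runs inside a temporary directory instead of a real NAS mount. Whether ScreenPipe's own playback and retention logic follow symlinks would need to be verified separately.

```python
import os
import shutil
import tempfile
import time
from pathlib import Path


def archive_old_media(data_dir: Path, archive_dir: Path, max_age_days: float) -> list[Path]:
    """Move .mp4 files older than max_age_days to archive_dir, leaving symlinks behind."""
    cutoff = time.time() - max_age_days * 86400
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for src in sorted(data_dir.glob("*.mp4")):
        if src.is_symlink() or src.stat().st_mtime >= cutoff:
            continue  # already archived, or still recent
        dest = archive_dir / src.name
        shutil.move(str(src), str(dest))  # also works across filesystems
        src.symlink_to(dest)  # the old path keeps resolving to the archived file
        moved.append(dest)
    return moved


with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "data"
    archive_dir = Path(tmp) / "nas"  # stand-in for a mounted NAS share
    data_dir.mkdir()
    old = data_dir / "old.mp4"
    new = data_dir / "new.mp4"
    old.write_bytes(b"old audio")
    new.write_bytes(b"new audio")
    forty_days_ago = time.time() - 40 * 86400
    os.utime(old, (forty_days_ago, forty_days_ago))  # backdate the old file

    moved = archive_old_media(data_dir, archive_dir, max_age_days=30)
    moved_names = [p.name for p in moved]
    old_is_link = old.is_symlink()
    old_still_readable = old.read_bytes() == b"old audio"
    new_is_link = new.is_symlink()
    print(moved_names, old_is_link, old_still_readable, new_is_link)
```

Skipping existing symlinks makes the function safe to run repeatedly, which matters for a scheduled job. If the archive volume is not mounted, the symlinks dangle and reads fail the same way as for deleted files.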
You said
Yes, I will do that. Is there a way to set the language for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
(Slovak)
(Bulgarian)
(English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
|
38998
|
NULL
|
NULL
|
NULL
|
|
38958
|
1441
|
12
|
2026-05-14T06:31:29.300469+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740289300_m1.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: 
db.sqlite","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: db.sqlite","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle 
button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. 
This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes I will do that. 
Is there a way to setup languages to transcription?","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As for setting up languages for transcription,","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes, you can configure it","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", but it helps to understand how ScreenPipe handles language natively first.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Transcription Engine Handles Language","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Transcription Engine Handles Language","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, 
ScreenPipe uses a","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"multilingual Whisper model","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and sets the language configuration to","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-Detect","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Advantage:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Drawback:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. 
When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How to Force a Specific Language","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How to Force a Specific Language","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are using the Desktop App UI:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Open the ScreenPipe settings.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Navigate to the","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AI/Transcription","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"section.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Look for 
the","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper Language","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Transcription Language","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"dropdown.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Change it from \"Auto\" to your specific language (e.g., English, Bulgarian, or Slovak).","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are running ScreenPipe via CLI/Config:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You can modify your underlying configuration (usually found in","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/pipe.json","depth":28,"bounds":{"left":0.0,"top":0.0,"width":0.134375,"height":0.020555556},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or by passing a flag when starting the daemon) to force a specific language code. 
You would use the standard ISO language codes:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Slovak)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Bulgarian)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(English)","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more 
options","depth":22,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXTextArea","text":"Ask Gemini","depth":20,"on_screen":true,"value":"Ask Gemini","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Ask Gemini","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file menu","depth":20,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Tools","depth":18,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open mode picker","depth":20,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Pro","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Microphone","depth":19,"on_screen":true,"role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Send message","depth":19,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":false,"is_focused":false,"is_selected":false}]...
|
-6207903613934946433
|
9209071893599628253
|
click
|
accessibility
|
NULL
|
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it transferred and transcribed? Is there a work-in-progress stage? How can I tell from the folder structure what has been done and what hasn't?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", the technical term for working out which speaker said what. It analyzes the audio to distinguish between your voice and the voices of others and labels each segment accordingly.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
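To make the storage step concrete, here is a minimal sketch of full-text search with SQLite's FTS5 extension. The table name `transcripts` and its columns are hypothetical stand-ins, not ScreenPipe's actual schema, and the rows are made-up sample data; it also assumes your Python build ships SQLite with FTS5 enabled.

```python
import sqlite3


def build_demo_index():
    """Create an in-memory FTS5 index with a few made-up transcript rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: only `body` is tokenized and searchable;
    # the UNINDEXED columns are stored alongside it as plain metadata.
    conn.execute(
        "CREATE VIRTUAL TABLE transcripts "
        "USING fts5(spoken_at UNINDEXED, speaker UNINDEXED, body)"
    )
    rows = [
        ("2026-05-12T06:49:17", "me", "let us move the backup job to the NAS tonight"),
        ("2026-05-12T07:40:48", "other", "the quarterly report is due on Friday"),
        ("2026-05-12T12:17:23", "me", "remind me to check the backup logs"),
    ]
    conn.executemany("INSERT INTO transcripts VALUES (?, ?, ?)", rows)
    return conn


def search(conn, phrase):
    """Return (spoken_at, speaker, body) rows matching an FTS5 query, best match first."""
    cur = conn.execute(
        "SELECT spoken_at, speaker, body FROM transcripts "
        "WHERE transcripts MATCH ? ORDER BY rank",
        (phrase,),
    )
    return cur.fetchall()
```

Calling `search(build_demo_index(), "backup")` returns the two rows that mention a backup; wrapping the query in double quotes, as in `'"quarterly report"'`, asks FTS5 for an exact phrase.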
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
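The buffering, queueing, and finalization steps above follow a standard producer/consumer pattern. The sketch below only illustrates that pattern and is not ScreenPipe's implementation: `fake_transcribe` is a hypothetical stand-in for the speech-to-text call, and a list stands in for the database.

```python
import queue
import threading


def fake_transcribe(chunk: bytes) -> str:
    """Hypothetical stand-in for the (slow) speech-to-text call."""
    return f"<{len(chunk)} bytes transcribed>"


def run_pipeline(chunks):
    """Queue audio chunks and let one worker transcribe them in arrival order."""
    pending = queue.Queue()  # the work-in-progress backlog
    finished = []            # stand-in for rows committed to the database

    def worker():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more chunks will arrive
                break
            index, chunk = item
            finished.append((index, fake_transcribe(chunk)))

    thread = threading.Thread(target=worker)
    thread.start()
    for index, chunk in enumerate(chunks):
        pending.put((index, chunk))  # the capture side keeps producing
    pending.put(None)
    thread.join()
    return finished
```

While the worker is busy, `pending.qsize()` is the size of the backlog, which is the closest analogue to the "WIP" state described here.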
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model (or if a radically better speaker diarization algorithm is released), having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
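If you want to know which transcripts have lost their audio, you can compare the paths stored in the database with what is actually on disk. The sketch below assumes a table named `audio_chunks` with a `file_path` column; both names are assumptions for illustration, so check your own db.sqlite schema before relying on them.

```python
import sqlite3
from pathlib import Path


def find_missing_media(conn, media_dir):
    """List file paths referenced in the database that no longer exist on disk.

    Assumes a hypothetical table `audio_chunks(file_path TEXT)`.
    Relative paths are resolved against `media_dir`.
    """
    missing = []
    query = "SELECT DISTINCT file_path FROM audio_chunks ORDER BY file_path"
    for (file_path,) in conn.execute(query):
        path = Path(file_path)
        if not path.is_absolute():
            path = Path(media_dir) / path
        # Path.exists() follows symlinks, so a link to an archived copy
        # counts as present for as long as its target is reachable.
        if not path.exists():
            missing.append(file_path)
    return missing
```

Open the database read-only (for example with `sqlite3.connect("file:db.sqlite?mode=ro", uri=True)`) so the check cannot interfere with a running recorder.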
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
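As a rough sketch of that archiving job, the function below moves .mp4 files older than a cutoff into an archive directory and leaves a symlink at the original path. The 30-day default, the use of modification time as the age test, and the flat `*.mp4` layout are all assumptions; it has not been tested against a live ScreenPipe install, so try it on a copy of the folder first.

```python
import os
import shutil
import time
from pathlib import Path


def archive_old_media(data_dir, archive_dir, max_age_days=30, now=None):
    """Move old .mp4 files to archive_dir and symlink them back into data_dir.

    Returns the names of the files that were moved.
    """
    data_dir, archive_dir = Path(data_dir), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    reference = time.time() if now is None else now
    cutoff = reference - max_age_days * 86400
    moved = []
    for src in sorted(data_dir.glob("*.mp4")):
        if src.is_symlink() or not src.is_file():
            continue  # already archived, or not a regular file
        if src.stat().st_mtime >= cutoff:
            continue  # still recent, keep it on the local disk
        dest = archive_dir / src.name
        if dest.exists():
            continue  # never overwrite something already in the archive
        shutil.move(str(src), str(dest))
        os.symlink(dest.resolve(), src)  # the old path keeps working
        moved.append(src.name)
    return moved
```

Run from cron, this would be called with the local data folder and the NAS mount point. If the NAS is not mounted, the symlinks dangle and playback of those files fails until it is back.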
You said
Yes, I will do that. Is there a way to set up languages for transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
(Slovak)
(Bulgarian)
(English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
|
38956
|
NULL
|
NULL
|
NULL
|
|
38969
|
1442
|
16
|
2026-05-14T06:31:43.809364+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740303809_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — 
Archive","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. 
When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text—it uses the file path stored in the database to pull up that specific","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. Re-processing and Model Upgrades","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. Re-processing and Model Upgrades","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Local LLMs and transcription models are improving rapidly. 
Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model—or if a radically better speaker diarization algorithm is released—having the raw","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The Source of Truth for Hallucinations","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The Source of Truth for Hallucinations","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. 
If a database query returns a completely nonsensical sentence, that","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"file is your fallback source of truth to verify what was actually said.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Database Views Them","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Database Views Them","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you were to open up your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and query the tables (e.g.,","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), you would see that the database treats the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folder essentially as a blob storage backend. 
The tables contain columns acting as soft foreign keys pointing to filenames like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip (input)_2026-05-12_07-40-48.mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"What happens if you delete them?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you manually","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"rm","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. 
However, any attempt to play back the audio for those older transcripts will silently fail or throw a \"file not found\" error in the background logs.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Managing the Storage Footprint","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Managing the Storage Footprint","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe records continuously, this folder will inevitably grow over time.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Built-in Garbage Collection:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Archiving:","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.02642952,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you prefer to keep a permanent, searchable \"life log\" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. 
You could script a simple cron job to regularly","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.22041224,"height":0.057861134},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"older","depth":29,"bounds":{"left":0.18035239,"top":0.0,"width":0.015292553,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.22174202,"height":0.057861134},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"bounds":{"left":0.075465426,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"bounds":{"left":0.08610372,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"bounds":{"left":0.09674202,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"bounds":{"left":0.107380316,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more 
options","depth":23,"bounds":{"left":0.11801862,"top":0.029928172,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy prompt","depth":21,"bounds":{"left":0.12566489,"top":0.0905826,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"bounds":{"left":0.14029256,"top":0.0905826,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said yes I will do that. Is there a way to setup languages to transcription?","depth":21,"bounds":{"left":0.16023937,"top":0.100159615,"width":0.15026596,"height":0.03830806},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"bounds":{"left":0.068484046,"top":0.1009577,"width":0.019946808,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes I will do that. 
Is there a way to setup languages to transcription?","depth":23,"bounds":{"left":0.16023937,"top":0.10175578,"width":0.12849069,"height":0.035514764},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"bounds":{"left":0.3025266,"top":0.17039107,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"bounds":{"left":0.09208777,"top":0.17278531,"width":0.030917553,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"bounds":{"left":0.08976064,"top":0.21428572,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"bounds":{"left":0.08976064,"top":0.21628092,"width":0.04105718,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"That is a solid plan. 
Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.","depth":27,"bounds":{"left":0.0787899,"top":0.21747805,"width":0.23088431,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As for setting up languages for transcription,","depth":27,"bounds":{"left":0.0787899,"top":0.28850758,"width":0.10920878,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"yes, you can configure it","depth":27,"bounds":{"left":0.18799867,"top":0.28850758,"width":0.06333112,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":", but it helps to understand how ScreenPipe handles language natively first.","depth":27,"bounds":{"left":0.0787899,"top":0.28850758,"width":0.20994017,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How the Transcription Engine Handles Language","depth":26,"bounds":{"left":0.0787899,"top":0.3499601,"width":0.234375,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How the Transcription Engine Handles Language","depth":27,"bounds":{"left":0.0787899,"top":0.35155627,"width":0.12549867,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses a","depth":27,"bounds":{"left":0.0787899,"top":0.37789306,"width":0.072972074,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"multilingual Whisper 
model","depth":27,"bounds":{"left":0.15176196,"top":0.37789306,"width":0.07047872,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"and sets the language configuration to","depth":27,"bounds":{"left":0.0787899,"top":0.37789306,"width":0.23321144,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-Detect","depth":27,"bounds":{"left":0.08510638,"top":0.39864326,"width":0.032247342,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"bounds":{"left":0.11735372,"top":0.39864326,"width":0.0013297872,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.","depth":27,"bounds":{"left":0.0787899,"top":0.42817238,"width":0.23038563,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Advantage:","depth":29,"bounds":{"left":0.09142287,"top":0.47845173,"width":0.040724736,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.","depth":29,"bounds":{"left":0.09142287,"top":0.47845173,"width":0.22174202,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The 
Drawback:","depth":29,"bounds":{"left":0.09142287,"top":0.5494813,"width":0.038896278,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.","depth":29,"bounds":{"left":0.09142287,"top":0.5494813,"width":0.20744681,"height":0.09936153},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"How to Force a Specific Language","depth":26,"bounds":{"left":0.0787899,"top":0.67318434,"width":0.234375,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"How to Force a Specific Language","depth":27,"bounds":{"left":0.0787899,"top":0.67478055,"width":0.08759973,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. 
This also slightly speeds up the processing because the model skips the detection phase.","depth":27,"bounds":{"left":0.0787899,"top":0.70111734,"width":0.2278923,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are using the Desktop App UI:","depth":27,"bounds":{"left":0.0787899,"top":0.7721468,"width":0.09275266,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Open the ScreenPipe settings.","depth":29,"bounds":{"left":0.09142287,"top":0.801676,"width":0.07347074,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Navigate to the","depth":29,"bounds":{"left":0.09142287,"top":0.8312051,"width":0.038231384,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio","depth":29,"bounds":{"left":0.12965426,"top":0.8312051,"width":0.014960106,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"bounds":{"left":0.14461437,"top":0.8312051,"width":0.0078125,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"AI/Transcription","depth":29,"bounds":{"left":0.15242687,"top":0.8312051,"width":0.041888297,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"section.","depth":29,"bounds":{"left":0.19431517,"top":0.8312051,"width":0.02044548,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Look for 
the","depth":29,"bounds":{"left":0.09142287,"top":0.8607342,"width":0.030585106,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper Language","depth":29,"bounds":{"left":0.12200798,"top":0.8607342,"width":0.04837101,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"bounds":{"left":0.17037898,"top":0.8607342,"width":0.0078125,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Transcription Language","depth":29,"bounds":{"left":0.17819148,"top":0.8607342,"width":0.061502658,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"dropdown.","depth":29,"bounds":{"left":0.23969415,"top":0.8607342,"width":0.027260639,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Change it from \"Auto\" to your specific language (e.g., English, Bulgarian, or Slovak).","depth":29,"bounds":{"left":0.09142287,"top":0.8902634,"width":0.20079787,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you are running ScreenPipe via CLI/Config:","depth":27,"bounds":{"left":0.0787899,"top":0.92378294,"width":0.116023935,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You can modify your underlying configuration (usually found 
in","depth":27,"bounds":{"left":0.0787899,"top":0.9445331,"width":0.15159574,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/pipe.json","depth":28,"bounds":{"left":0.23238032,"top":0.94573027,"width":0.064328454,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:","depth":27,"bounds":{"left":0.0787899,"top":0.9445331,"width":0.234375,"height":0.05546689},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Slovak)","depth":29,"bounds":{"left":0.14012633,"top":1.0,"width":0.020777926,"height":-0.015562654},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(Bulgarian)","depth":29,"bounds":{"left":0.14012633,"top":1.0,"width":0.027925532,"height":-0.04509175},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(English)","depth":29,"bounds":{"left":0.14012633,"top":1.0,"width":0.02244016,"height":-0.07462096},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle 
button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":22,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXTextArea","text":"Ask Gemini","depth":20,"bounds":{"left":0.08211436,"top":0.83439744,"width":0.22573139,"height":0.01915403},"on_screen":true,"value":"Ask Gemini","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Ask Gemini","depth":21,"bounds":{"left":0.08211436,"top":0.8347965,"width":0.030086435,"height":0.018355945},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file 
menu","depth":20,"bounds":{"left":0.078125,"top":0.87031126,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Tools","depth":18,"bounds":{"left":0.094082445,"top":0.87031126,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open mode picker","depth":20,"bounds":{"left":0.27044547,"top":0.867917,"width":0.026097074,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Pro","depth":23,"bounds":{"left":0.2757646,"top":0.87669593,"width":0.007480053,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Microphone","depth":19,"bounds":{"left":0.29853722,"top":0.867917,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Send message","depth":19,"bounds":{"left":0.30485374,"top":0.8671189,"width":0.013962766,"height":0.033519555},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":false,"is_focused":false,"is_selected":false}]...
|
-6207903613934946433
|
9209071893599628253
|
visual_change
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
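The buffering, queueing, and finalization steps described above can be illustrated with a small producer/consumer sketch. This is not ScreenPipe's actual code: `fake_transcribe`, the chunk names, and the in-memory `transcripts` table are hypothetical stand-ins chosen only to show the shape of such a pipeline.

```python
import queue
import sqlite3
import threading


def fake_transcribe(chunk_name: str) -> str:
    """Hypothetical stand-in for a real speech-to-text call."""
    return f"transcript of {chunk_name}"


def run_pipeline(chunk_names):
    """Queue audio chunks, transcribe them in a worker, commit text to SQLite."""
    pending = queue.Queue()  # the "work in progress" backlog
    # check_same_thread=False lets the worker thread use this connection;
    # only one thread touches it at a time in this sketch.
    db = sqlite3.connect(":memory:", check_same_thread=False)
    db.execute("CREATE TABLE transcripts (chunk TEXT, text TEXT)")

    def worker():
        while True:
            chunk = pending.get()
            if chunk is None:  # sentinel: no more chunks will arrive
                break
            text = fake_transcribe(chunk)
            # Finalization: a chunk counts as "done" once its row is committed.
            db.execute("INSERT INTO transcripts VALUES (?, ?)", (chunk, text))
            db.commit()

    thread = threading.Thread(target=worker)
    thread.start()
    for name in chunk_names:  # buffering: captured chunks enter the queue
        pending.put(name)
    pending.put(None)
    thread.join()
    return db.execute(
        "SELECT chunk, text FROM transcripts ORDER BY rowid"
    ).fetchall()
```

In this toy model, anything still sitting in `pending` is the backlog, and anything with a row in `transcripts` is finished.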
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references.
When you search your ScreenPipe history for a specific conversation, the UI doesn't just show you the text; it uses the file path stored in the database to pull up that specific .mp4 file and starts playing it at the exact second the keyword was spoken. Without these files, you get the transcript, but you lose the ability to actually listen to the context, tone, or nuance of the original conversation.
2. Re-processing and Model Upgrades
Local LLMs and transcription models are improving rapidly. Right now, your system might be using a quantized Whisper model to balance CPU load with accuracy. If you later decide to run a heavier, more accurate model (or if a radically better speaker diarization algorithm is released), having the raw .mp4 files allows you to re-process historical audio. Without the raw input files, your data is permanently locked to the quality of the transcription model you were running on May 12th.
3. The Source of Truth for Hallucinations
Speech-to-text engines, even great ones, occasionally hallucinate or misinterpret technical jargon, overlapping voices, or background noise. If a database query returns a completely nonsensical sentence, that .mp4 file is your fallback source of truth to verify what was actually said.
How the Database Views Them
If you were to open up your db.sqlite and query the tables (e.g., audio_transcriptions), you would see that the database treats the ~/.screenpipe/data/ folder essentially as a blob storage backend. The tables contain columns acting as soft foreign keys pointing to filenames like soundcore AeroClip (input)_2026-05-12_07-40-48.mp4.
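The filenames quoted in this conversation appear to follow a `<device> (<input|output>)_<date>_<time>.mp4` pattern. As a rough sketch, assuming that pattern holds for every file (it is inferred only from the four examples above, and the time zone of the timestamp is unknown), a filename can be split into its parts like this. `MediaFile` and `parse_media_filename` are hypothetical names, not part of ScreenPipe.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class MediaFile(NamedTuple):
    device: str        # e.g. "soundcore AeroClip"
    direction: str     # "input" (microphone) or "output" (system audio)
    started_at: datetime


# Pattern inferred from the example filenames; an assumption, not a spec.
_NAME_RE = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)"
    r"_(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)


def parse_media_filename(name: str) -> Optional[MediaFile]:
    """Return the parsed parts of a media filename, or None if it does not match."""
    match = _NAME_RE.match(name)
    if match is None:
        return None
    started_at = datetime.strptime(match.group("stamp"), "%Y-%m-%d_%H-%M-%S")
    return MediaFile(match.group("device"), match.group("direction"), started_at)
```

Grouping the parsed results by `device` or by day is one way to see at a glance which recordings are present in the folder.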
What happens if you delete them?
If you manually rm those files to free up space on your Mac's SSD, ScreenPipe won't crash. The SQLite database will still hold all your searchable text. However, any attempt to play back the audio for those older transcripts will silently fail or throw a "file not found" error in the background logs.
Managing the Storage Footprint
Because ScreenPipe records continuously, this folder will inevitably grow over time.
Built-in Garbage Collection: ScreenPipe usually handles this automatically based on retention settings (e.g., deleting media older than 30 days or when the folder hits a specific GB limit).
Archiving: If you prefer to keep a permanent, searchable "life log" without eating up your local SSD space, you don't necessarily have to let ScreenPipe delete them. You could script a simple cron job to regularly move older files to your massive 4-drive RAID 5 array, replacing the local files with symlinks. This would keep the database references intact while offloading the heavy media storage to your NAS.
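A minimal sketch of that archiving idea follows. The function name, the age threshold, and the archive location are assumptions made for illustration, and whether ScreenPipe follows symlinks during playback has not been verified here, so it is worth trying on a copy of the data first.

```python
import shutil
import time
from pathlib import Path


def archive_old_media(data_dir: Path, archive_dir: Path, max_age_days: float) -> list[Path]:
    """Move .mp4 files older than max_age_days and leave symlinks behind.

    Returns the new locations of the files that were moved.
    """
    cutoff = time.time() - max_age_days * 86400
    archive_dir = archive_dir.resolve()
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for src in sorted(data_dir.glob("*.mp4")):
        # Skip files already replaced by a symlink, and files that are too new.
        if src.is_symlink() or src.stat().st_mtime >= cutoff:
            continue
        dest = archive_dir / src.name
        shutil.move(str(src), str(dest))  # copies across filesystems if needed
        src.symlink_to(dest)              # same path as before, now a link
        moved.append(dest)
    return moved
```

A cron entry could then call something like `archive_old_media(Path.home() / ".screenpipe" / "data", Path("/Volumes/nas/screenpipe"), 30)`, where the NAS mount point is a placeholder. Running it twice is harmless because existing symlinks are skipped.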
You said
yes I will do that. Is there a way to setup languages to transcription?
Gemini said
That is a solid plan. Offloading those older media files to your 4-drive RAID 5 setup (or even the secondary SSD-only Beelink NAS if you end up building it) will keep your local SSD lean while keeping your database searchable.
As for setting up languages for transcription, yes, you can configure it, but it helps to understand how ScreenPipe handles language natively first.
How the Transcription Engine Handles Language
By default, ScreenPipe uses a multilingual Whisper model and sets the language configuration to Auto-Detect.
When a new audio chunk comes in, the model listens to the first few seconds of audio, guesses the language, and transcribes the rest of that chunk using that assumption.
The Advantage: If you are watching a tutorial in English, and then an hour later you are in a Zoom call speaking Bulgarian, ScreenPipe will automatically switch gears without you having to touch the settings.
The Drawback: Auto-detect can sometimes guess wrong, especially if there is heavy background noise, if someone starts a sentence with a foreign loanword, or if you are rapidly code-switching between Slovak and English in the same conversation. When it guesses wrong, it might try to force Slovak words into English spelling, resulting in gibberish.
How to Force a Specific Language
If you want to ensure maximum accuracy and you know you will primarily be recording in one language for a specific period, you can explicitly set the language. This also slightly speeds up the processing because the model skips the detection phase.
If you are using the Desktop App UI:
Open the ScreenPipe settings.
Navigate to the Audio or AI/Transcription section.
Look for the Whisper Language or Transcription Language dropdown.
Change it from "Auto" to your specific language (e.g., English, Bulgarian, or Slovak).
If you are running ScreenPipe via CLI/Config:
You can modify your underlying configuration (usually found in ~/.screenpipe/pipe.json or by passing a flag when starting the daemon) to force a specific language code. You would use the standard ISO language codes:
(Slovak)
(Bulgarian)
(English)
If you find yourself frequently jumping between those three languages, your best bet is to leave it on Auto-Detect and perhaps run a slightly larger, more accurate Whisper model (like Whisper 'Small' or 'Base' instead of 'Tiny') if your hardware has the headroom for it.
|
38968
|
NULL
|
NULL
|
NULL
|
|
38918
|
1438
|
14
|
2026-05-14T06:30:48.884147+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740248884_m1.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
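To make the chunking step concrete, here is a minimal Python sketch that splits a continuous sample buffer into fixed-length chunks. The function name, the sample rate, and the chunk duration are all assumptions for illustration; the text above does not say what chunk length ScreenPipe actually uses.

```python
from typing import Iterator, Sequence


def split_into_chunks(
    samples: Sequence[float], sample_rate: int, chunk_seconds: float
) -> Iterator[Sequence[float]]:
    """Yield consecutive slices of `samples`, each at most `chunk_seconds` long.

    Hypothetical helper: the last chunk may be shorter than the others.
    """
    chunk_size = int(sample_rate * chunk_seconds)
    if chunk_size <= 0:
        raise ValueError("sample_rate * chunk_seconds must be at least 1 sample")
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]
```

Each yielded chunk would then be handed to the transcription stage described next.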
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for working out which speaker is talking at each point in the recording. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to quickly search for a phrase you heard in a meeting three weeks ago.
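The storage step can be sketched with Python's built-in sqlite3 module. This is only an illustration of how an FTS5 index makes transcripts searchable: the table name, column names, and sample rows are invented for the example and are not ScreenPipe's real schema, and it assumes your SQLite build includes the FTS5 extension.

```python
import sqlite3
from typing import Iterable, List, Tuple

Row = Tuple[str, str, str]  # (device, spoken_at, text)


def build_index(rows: Iterable[Row]) -> sqlite3.Connection:
    """Create an in-memory FTS5 table (hypothetical schema) and load rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(device, spoken_at, text)")
    conn.executemany("INSERT INTO transcripts VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def search(conn: sqlite3.Connection, phrase: str) -> List[Row]:
    """Return rows containing `phrase` as an exact phrase, best match first."""
    # Wrap in double quotes so FTS5 treats the input as one phrase.
    quoted = '"' + phrase.replace('"', '""') + '"'
    cursor = conn.execute(
        "SELECT device, spoken_at, text FROM transcripts "
        "WHERE transcripts MATCH ? ORDER BY rank",
        (quoted,),
    )
    return cursor.fetchall()
```

A call such as search(conn, "quarterly budget") returns only the rows whose text contains those two words next to each other, without scanning every transcript.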
3. The "Work in Progress" (WIP) Stage
There is a slight delay between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
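The three stages above can be sketched as a small Python loop. Everything here is hypothetical: the function and table names are invented, the transcriber is a stand-in callable rather than Whisper, and the real application presumably runs this concurrently instead of in one loop.

```python
import sqlite3
from collections import deque
from typing import Callable, Iterable, Tuple

Chunk = Tuple[str, bytes]  # (captured_at timestamp, raw audio bytes)


def process_backlog(
    chunks: Iterable[Chunk],
    transcribe: Callable[[bytes], str],
    conn: sqlite3.Connection,
) -> int:
    """Drain buffered chunks in order and commit each transcript; return the count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transcriptions (captured_at TEXT, text TEXT)"
    )
    pending = deque(chunks)  # buffering + processing queue
    done = 0
    while pending:
        captured_at, audio = pending.popleft()
        text = transcribe(audio)  # the slow, CPU/GPU-bound step
        conn.execute(
            "INSERT INTO transcriptions VALUES (?, ?)", (captured_at, text)
        )
        conn.commit()  # finalization: the chunk is now "done"
        done += 1
    return done
```

In this model, anything still sitting in `pending` is the work in progress, and a chunk counts as finished only once its row is committed.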
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database: This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The media folders: This is where the compressed audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive....
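The file names quoted in the question follow a visible pattern: device name, direction in parentheses, then a date and time. A small Python sketch can pull those parts out, for example to line recordings up with database rows. The pattern is inferred only from the four example names above, and the function and class names are invented; other installs or versions may name files differently.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class AudioFile(NamedTuple):
    device: str
    direction: str  # "input" or "output"
    started_at: datetime


# Inferred from names like "System Audio (output)_2026-05-11_06-17-14.mp4".
_NAME_RE = re.compile(
    r"(?P<device>.+) \((?P<direction>input|output)\)"
    r"_(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4"
)


def parse_audio_filename(name: str) -> Optional[AudioFile]:
    """Parse one file name; return None if it does not match the assumed pattern."""
    match = _NAME_RE.fullmatch(name)
    if match is None:
        return None
    started_at = datetime.strptime(match["stamp"], "%Y-%m-%d_%H-%M-%S")
    return AudioFile(match["device"], match["direction"], started_at)
```

Files that return None are simply ones this sketch does not recognize, so it is worth listing them separately rather than assuming they are unprocessed.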
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXButton","text":"Mute tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: 
archive.db","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: db.sqlite","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize 
sidebar","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New 
chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your 
hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. 
(Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". 
This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-5112732177169775684
|
9207951448313875351
|
click
|
accessibility
|
NULL
|
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage:
The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive....
|
38915
|
NULL
|
NULL
|
NULL
|
|
38782
|
1437
|
53
|
2026-05-14T06:28:47.678727+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740127678_m1.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Mute tab
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage:
The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXButton","text":"Mute tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: 
archive.db","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: db.sqlite","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize 
sidebar","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New 
chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your 
hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. 
(Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". 
This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-2710080693951104241
|
9207950350949731223
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Mute tab
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/...
|
38780
|
NULL
|
NULL
|
NULL
|
|
38801
|
1437
|
62
|
2026-05-14T06:29:12.179748+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740152179_m1.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: 
db.sqlite","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: db.sqlite","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle 
button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-6068333804328483113
|
9207950348869356439
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
39260
|
1443
|
60
|
2026-05-14T06:40:30.985499+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740830985_m1.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: 
db.sqlite","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: db.sqlite","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle 
button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-2732966694117156214
|
9207950348869356439
|
visual_change
|
accessibility
|
NULL
|
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it transferred and transcribed? Is there a work-in-progress stage? How can I see from the folder structure what has been done and what hasn't?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
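The chunking step can be pictured with a minimal sketch. This is not ScreenPipe's actual code: the function name `chunk_samples`, the toy sample rate, and the chunk length are all assumptions chosen for illustration.

```python
from typing import Iterable, Iterator, List


def chunk_samples(
    samples: Iterable[float], sample_rate: int, chunk_seconds: float
) -> Iterator[List[float]]:
    """Yield consecutive fixed-length chunks from a stream of audio samples."""
    chunk_size = int(sample_rate * chunk_seconds)
    if chunk_size <= 0:
        raise ValueError("chunk size must be positive")
    buffer: List[float] = []
    for sample in samples:
        buffer.append(sample)
        if len(buffer) == chunk_size:
            yield buffer
            buffer = []
    if buffer:
        # Flush the final, possibly shorter, chunk
        yield buffer


# Ten silent samples at a toy rate of 4 samples per second, cut into 1-second chunks
chunks = list(chunk_samples([0.0] * 10, sample_rate=4, chunk_seconds=1.0))
print([len(c) for c in chunks])
```

A real recorder would read from an audio device instead of a list, but the buffering logic would follow the same shape.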
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for separating speech by speaker. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
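To make the full-text search idea concrete, here is a small sketch using Python's built-in `sqlite3` module. The table name `transcripts` and its columns are hypothetical and are not ScreenPipe's real schema; the sketch also assumes the SQLite build bundled with Python includes the FTS5 extension, which is common but not guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and column names, chosen only for this example
conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(speaker, text)")
conn.executemany(
    "INSERT INTO transcripts (speaker, text) VALUES (?, ?)",
    [
        ("me", "let's move the budget review to Thursday"),
        ("other", "the deployment finished without errors"),
    ],
)


def search(phrase: str) -> list:
    """Return (speaker, text) rows whose text contains the phrase."""
    # Double quotes make FTS5 treat the input as a single phrase query
    query = '"' + phrase.replace('"', '""') + '"'
    rows = conn.execute(
        "SELECT speaker, text FROM transcripts "
        "WHERE transcripts MATCH ? ORDER BY rank",
        (query,),
    )
    return rows.fetchall()


results = search("budget review")
print(results)
```

The `MATCH` operator consults the full-text index instead of scanning every row, which is why phrase lookups stay fast as the archive grows.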
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
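The three stages above can be sketched as a producer and a worker sharing a queue. Everything here is illustrative: `fake_transcribe` is a stand-in that does no speech recognition, and the in-memory `done` list merely plays the role of the database.

```python
import queue
import threading


def fake_transcribe(chunk: bytes) -> str:
    # Stand-in for a speech-to-text call; not a real Whisper API
    return f"<{len(chunk)} bytes transcribed>"


def worker(pending: "queue.Queue", done: list) -> None:
    """Take chunks off the queue one at a time and record the result."""
    while True:
        item = pending.get()
        if item is None:
            # Sentinel value: no more chunks will arrive
            pending.task_done()
            break
        chunk_id, chunk = item
        done.append((chunk_id, fake_transcribe(chunk)))
        pending.task_done()


pending: "queue.Queue" = queue.Queue()
done: list = []
thread = threading.Thread(target=worker, args=(pending, done))
thread.start()

# Buffering: the recorder enqueues chunks faster than they are processed
for chunk_id in range(3):
    pending.put((chunk_id, b"\x00" * 1024))
pending.put(None)

# Finalization: wait until every queued chunk has been handled
pending.join()
thread.join()
print(done)
```

While chunks sit in `pending`, they are in the work-in-progress state described above; once they appear in `done`, they are finished.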
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database: This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The media folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals, this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
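One way to check a folder against the database is sketched below. It assumes recordings are named like `System Audio (output)_2026-05-11_06-17-14.mp4`, that is, device name, direction, then a timestamp. The `Recording` type, `parse_recording`, and `untranscribed` are hypothetical helpers, and how you obtain the set of files the database already knows about depends on the real schema, which is not shown here.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class Recording(NamedTuple):
    device: str
    direction: str  # "input" or "output"
    started_at: datetime


# Assumed naming scheme: "<device> (<direction>)_<YYYY-MM-DD>_<HH-MM-SS>.mp4"
PATTERN = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)_"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)


def parse_recording(filename: str) -> Optional[Recording]:
    """Split a recording filename into device, direction, and start time."""
    match = PATTERN.match(filename)
    if match is None:
        return None
    started_at = datetime.strptime(match["stamp"], "%Y-%m-%d_%H-%M-%S")
    return Recording(match["device"], match["direction"], started_at)


def untranscribed(on_disk: set, in_database: set) -> list:
    """Files present on disk that the database does not reference yet."""
    return sorted(on_disk - in_database)


example = parse_recording("System Audio (output)_2026-05-11_06-17-14.mp4")
print(example)
```

Files returned by `untranscribed` would be the candidates still waiting in the WIP stage, under the assumption that a finished file is always referenced from the database.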
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see files in ~/.screenpipe/data/ such as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4, MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4, soundcore AeroClip (input)_2026-05-12_07-40-48.mp4, or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcription and storage in the SQLite DB?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
38792
|
1437
|
58
|
2026-05-14T06:28:59.467321+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740139467_m1.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXButton","text":"Mute tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: 
archive.db","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: db.sqlite","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize 
sidebar","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New 
chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your 
hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. 
(Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". 
This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
4027755722267345966
|
9207950348869356183
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Mute tab
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/
directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the
db.sqlite
database along with...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
38815
|
1439
|
81
|
2026-05-14T06:29:17.132831+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740157132_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
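The filenames quoted in the question appear to follow a "device (direction)_date_time.mp4" pattern. The sketch below parses that pattern; it is inferred only from the four example names above and is not a documented ScreenPipe naming scheme, so the regular expression and the function name `parse_recording_name` are assumptions.

```python
import re
from datetime import datetime

# Pattern inferred from the example filenames; treat it as an assumption.
PATTERN = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)_"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)


def parse_recording_name(name):
    """Split a recording filename into (device, direction, start time).

    Returns None when the name does not match the assumed pattern.
    """
    match = PATTERN.match(name)
    if match is None:
        return None
    started = datetime.strptime(match.group("stamp"), "%Y-%m-%d_%H-%M-%S")
    return match.group("device"), match.group("direction"), started


parsed = parse_recording_name("System Audio (output)_2026-05-11_06-17-14.mp4")
```

Grouping the parsed results by device and date gives a quick overview of which recordings exist for a given day.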
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — 
Archive","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"}]...
|
-6359130848236053429
|
9207950348869356183
|
click
|
accessibility
|
NULL
|
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
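To make the chunking step concrete, here is a minimal sketch of cutting a continuous run of samples into fixed-length chunks. The chunk duration and the function name `split_into_chunks` are assumptions for illustration; the text above does not say how long ScreenPipe's chunks are or how it cuts them.

```python
def split_into_chunks(samples, sample_rate, chunk_seconds):
    """Yield consecutive slices of `samples`, each covering `chunk_seconds`.

    The final chunk may be shorter when the stream does not divide evenly.
    """
    chunk_len = sample_rate * chunk_seconds
    for start in range(0, len(samples), chunk_len):
        yield samples[start:start + chunk_len]


# Ten samples at 2 samples per second, cut into 2-second chunks.
chunks = list(split_into_chunks(list(range(10)), sample_rate=2, chunk_seconds=2))
```

Each chunk can then be handed to the transcription stage independently, which is what lets a continuous recording be processed piece by piece.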
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
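The full-text search described under Storage can be sketched with Python's built-in sqlite3 module. This is only an illustration of how an FTS5 index answers a phrase query: the table name transcript_fts, its columns, and the sample rows are hypothetical and are not ScreenPipe's actual schema, and the sketch assumes the local SQLite build includes FTS5.

```python
import sqlite3

# Hypothetical schema for illustration only; ScreenPipe's real tables may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE transcript_fts "
    "USING fts5(device UNINDEXED, spoken_at UNINDEXED, text)"
)
rows = [
    ("System Audio (output)", "2026-05-11T06:17:14", "let's move the budget review to Friday"),
    ("MacBook Pro Microphone (input)", "2026-05-12T12:17:23", "I will send the meeting notes tonight"),
]
conn.executemany("INSERT INTO transcript_fts VALUES (?, ?, ?)", rows)


def search(phrase):
    """Return (device, spoken_at, text) rows containing the exact phrase, best match first."""
    # Wrap the phrase in double quotes so FTS5 treats it as one phrase, not separate terms.
    query = '"' + phrase.replace('"', '""') + '"'
    return conn.execute(
        "SELECT device, spoken_at, text FROM transcript_fts "
        "WHERE transcript_fts MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()


hits = search("budget review")
```

Only the text column is indexed here; device and spoken_at are stored as plain payload so a hit can be traced back to a stream and a time.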
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
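The buffering, queueing, and finalization steps above follow a common producer/consumer shape. The sketch below shows that general pattern only; it is not ScreenPipe's code, and fake_transcribe, run_pipeline, and the chunk names are hypothetical stand-ins.

```python
import queue
import threading


def fake_transcribe(chunk):
    # Stand-in for a speech-to-text call; a real engine would return recognized words.
    return f"text for {chunk}"


def run_pipeline(chunks):
    """Push chunks through a queue and return them in the order they were finalized."""
    pending = queue.Queue()  # the work-in-progress stage: captured but not yet transcribed
    done = []                # stands in for rows committed to the database

    def worker():
        while True:
            chunk = pending.get()
            if chunk is None:  # sentinel: no more audio will arrive
                break
            done.append((chunk, fake_transcribe(chunk)))

    t = threading.Thread(target=worker)
    t.start()
    for chunk in chunks:  # the capture side only enqueues and never waits on transcription
        pending.put(chunk)
    pending.put(None)
    t.join()
    return done


result = run_pipeline(["chunk-1", "chunk-2", "chunk-3"])
```

Anything still sitting in pending corresponds to audio that has been heard but not yet saved as text, which is the backlog the answer calls the WIP stage.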
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
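The four file names quoted in the question share a visible shape: a device name, a direction in parentheses, then a date and a time. A minimal sketch for splitting such names follows. The pattern is inferred from those four examples only and is an assumption, not a documented naming format; parse_audio_name is a hypothetical helper.

```python
import re
from datetime import datetime

# Pattern inferred from the file names quoted in the question; an assumption, not a spec.
NAME_RE = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)_"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)


def parse_audio_name(name):
    """Split a recording file name into device, direction, and timestamp, or return None."""
    match = NAME_RE.match(name)
    if match is None:
        return None
    return {
        "device": match.group("device"),
        "direction": match.group("direction"),
        # The time zone of the stamp is not stated in the source, so it stays naive.
        "recorded_at": datetime.strptime(match.group("stamp"), "%Y-%m-%d_%H-%M-%S"),
    }


info = parse_audio_name("System Audio (output)_2026-05-11_06-17-14.mp4")
```

Grouping the parsed results by device and date gives a quick view of which streams were recorded on which day, without opening the database.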
1. The "Time Machine" Playback...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
38819
|
1439
|
83
|
2026-05-14T06:29:18.999053+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740158999_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — 
Archive","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
2859498591895944029
|
9207950348869356183
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
38827
|
1439
|
88
|
2026-05-14T06:29:25.593328+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740165593_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — 
Archive","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
2859498591895944029
|
9207950348869356183
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Gemini
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
Diarization: As it transcribes the text, the engine also performs "diarization," a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the...
|
38825
|
NULL
|
NULL
|
NULL
|
|
38848
|
1440
|
1
|
2026-05-14T06:29:49.772604+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740189772_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Gemini
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
Diarization: As it transcribes the text, the engine also performs "diarization," a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — 
Archive","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
4258317559196419736
|
9207950348869356183
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/
directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with...
|
38846
|
NULL
|
NULL
|
NULL
|
|
38853
|
1440
|
4
|
2026-05-14T06:29:53.459400+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740193459_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
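To make the FTS5 step concrete, here is a minimal sketch of how a full-text index over transcripts can be queried from Python. It runs against an in-memory database; the table name `transcripts` and the columns `speaker` and `text` are hypothetical and are not taken from ScreenPipe's actual schema. It also assumes the local SQLite build has FTS5 enabled.

```python
import sqlite3

# In-memory database so the sketch does not touch any real ScreenPipe file.
conn = sqlite3.connect(":memory:")

# Hypothetical table and column names, for illustration only.
conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(speaker, text)")
conn.executemany(
    "INSERT INTO transcripts (speaker, text) VALUES (?, ?)",
    [
        ("me", "let us move the release to next friday"),
        ("other", "the budget review is scheduled for monday"),
    ],
)


def search(conn, query):
    """Return (speaker, text) rows that match the FTS5 query, best match first."""
    return conn.execute(
        "SELECT speaker, text FROM transcripts "
        "WHERE transcripts MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()


hits = search(conn, "budget")
for speaker, text in hits:
    print(f"{speaker}: {text}")
```

The `MATCH` operator is what makes the lookup an index search rather than a scan with `LIKE`, which is why a phrase from weeks ago can still be found quickly.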
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database ( ): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The or folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals (like ), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
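The file names quoted in the question follow a visible pattern: a device name, an (input) or (output) marker, and a date-time stamp. As a rough sketch, they could be split into those parts as shown below. The pattern is inferred only from the four example names, the helper `parse_audio_filename` is hypothetical, and the time zone and exact meaning of the stamp are not known from the names alone.

```python
import re
from datetime import datetime

# Pattern inferred from the example names only; other recordings may differ.
AUDIO_NAME = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)_"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)


def parse_audio_filename(name):
    """Split a recording file name into device, direction, and timestamp.

    Returns None when the name does not follow the assumed pattern.
    """
    match = AUDIO_NAME.match(name)
    if match is None:
        return None
    return {
        "device": match.group("device"),
        "direction": match.group("direction"),
        # Naive datetime: the file name carries no time zone information.
        "timestamp": datetime.strptime(match.group("stamp"), "%Y-%m-%d_%H-%M-%S"),
    }


info = parse_audio_filename("System Audio (output)_2026-05-11_06-17-14.mp4")
print(info)
```

Grouping the parsed results by device or by day gives a quick view of which sources were recorded and when, without opening the database.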
1. The "Time Machine" Playback...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — 
Archive","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"}]...
|
-6359130848236053429
|
9207950348869356183
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization," a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
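The storage step described above can be illustrated with a minimal sketch of an FTS5-backed transcript index. This is not ScreenPipe's actual schema: the table name `transcript_fts`, its columns, and the demo rows are all assumptions made up for illustration, and the example needs a Python build whose SQLite includes the FTS5 extension.

```python
import sqlite3

# Hypothetical demo rows: (timestamp, speaker, text). Not real ScreenPipe data.
DEMO_ROWS = [
    ("2026-05-12T06:49:17", "speaker_0", "let us start with the roadmap"),
    ("2026-05-12T06:50:02", "speaker_1", "the quarterly budget needs one more review"),
    ("2026-05-12T06:51:40", "speaker_0", "budget aside, the quarterly goals look fine"),
]


def build_demo_index(rows):
    """Create an in-memory FTS5 table and load (timestamp, speaker, text) rows."""
    conn = sqlite3.connect(":memory:")
    # Table and column names are assumptions for this sketch only.
    # UNINDEXED columns are stored but excluded from full-text matching.
    conn.execute(
        "CREATE VIRTUAL TABLE transcript_fts "
        "USING fts5(timestamp UNINDEXED, speaker UNINDEXED, text)"
    )
    conn.executemany("INSERT INTO transcript_fts VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def search(conn, phrase):
    """Return rows whose text contains the words of `phrase` as one adjacent phrase."""
    # Wrapping the input in double quotes makes FTS5 treat it as a single phrase.
    query = '"' + phrase.replace('"', '""') + '"'
    cur = conn.execute(
        "SELECT timestamp, speaker, text FROM transcript_fts "
        "WHERE transcript_fts MATCH ? ORDER BY rank",
        (query,),
    )
    return cur.fetchall()
```

Calling `search(build_demo_index(DEMO_ROWS), "quarterly budget")` matches only the row where the two words appear next to each other, which is the behaviour that makes phrase lookup over weeks of transcripts practical.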
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
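The three stages above (buffering, queueing, finalization) follow a generic producer/consumer shape, which can be sketched as follows. This is only an illustration of that shape, not ScreenPipe's implementation: `fake_transcribe` is a hypothetical stand-in for a speech-to-text call, and the `done` list stands in for rows committed to the database.

```python
import queue
import threading


def fake_transcribe(chunk: bytes) -> str:
    """Hypothetical stand-in for a speech-to-text engine call."""
    return f"<{len(chunk)} bytes transcribed>"


def run_pipeline(chunks):
    """Queue audio chunks, transcribe them on one worker, and collect the results."""
    pending = queue.Queue()  # chunks waiting here are the "work in progress"
    done = []                # stands in for rows committed to the database

    def worker():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more chunks will arrive
                pending.task_done()
                break
            index, chunk = item
            done.append((index, fake_transcribe(chunk)))
            pending.task_done()

    thread = threading.Thread(target=worker)
    thread.start()
    for index, chunk in enumerate(chunks):
        pending.put((index, chunk))
    pending.put(None)
    pending.join()   # block until every queued item has been processed
    thread.join()
    return done
```

With a single worker the results come back in capture order; anything still sitting in `pending` at a given moment corresponds to audio that has been recorded but not yet transcribed.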
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database ([…]): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The […] or […] folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals (like […]), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
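The four file names quoted in the question share a visible pattern: a device name, a direction in parentheses, and a date and time stamp. A minimal sketch of a parser for that pattern is shown here. The pattern is inferred only from those four examples, the names `AudioChunkName` and `parse_audio_chunk_name` are hypothetical, and whether the stamp is local time or UTC is not stated in the conversation, so the parsed value is left timezone-naive.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class AudioChunkName(NamedTuple):
    device: str
    direction: str        # "input" or "output"
    started_at: datetime  # timezone-naive; the source does not say which zone


# Pattern inferred from the example names only; other naming variants may exist.
_PATTERN = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)_"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)


def parse_audio_chunk_name(filename: str) -> Optional[AudioChunkName]:
    """Split an audio file name into device, direction, and start time.

    Returns None when the name does not follow the assumed pattern.
    """
    match = _PATTERN.match(filename)
    if match is None:
        return None
    started_at = datetime.strptime(match.group("stamp"), "%Y-%m-%d_%H-%M-%S")
    return AudioChunkName(match.group("device"), match.group("direction"), started_at)
```

Grouping the parsed results by `device` and sorting by `started_at` gives a quick overview of which microphones or outputs were recorded on a given day.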
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
38911
|
1440
|
36
|
2026-05-14T06:30:39.369765+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740239369_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization," a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database ([…]): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The […] or […] folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals (like […]), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — 
Archive","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"}]...
|
-6359130848236053429
|
9207950348869356183
|
click
|
accessibility
|
NULL
|
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
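The chunking step described above can be sketched roughly as follows. This is only an illustration, not ScreenPipe's actual code: the function name `chunk_samples`, the sample rate, and the chunk length are all assumptions made for the example.

```python
from typing import Iterable, Iterator, List


def chunk_samples(
    samples: Iterable[float], sample_rate: int, chunk_seconds: float
) -> Iterator[List[float]]:
    """Split a continuous stream of audio samples into fixed-length chunks.

    Hypothetical helper: the real chunk duration used by ScreenPipe is not
    stated in the conversation, so it is a parameter here.
    """
    size = int(sample_rate * chunk_seconds)
    if size <= 0:
        raise ValueError("sample_rate * chunk_seconds must be at least 1 sample")

    buffer: List[float] = []
    for sample in samples:
        buffer.append(sample)
        if len(buffer) == size:
            yield buffer      # a full chunk is ready for the next stage
            buffer = []
    if buffer:
        yield buffer          # trailing partial chunk at the end of the stream
```

Each yielded list stands for one chunk that would be handed to the transcription stage.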
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", the technical term for separating the audio by speaker. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
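To make the FTS5 point concrete, here is a small sketch of a full-text phrase search with Python's built-in `sqlite3` module. The table name `transcripts` and its columns are invented for the example and are not ScreenPipe's real schema; it also assumes the local SQLite build has FTS5 enabled, which is common but not guaranteed.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory FTS5 table and load (speaker, text, recorded_at) rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: recorded_at is stored but not tokenized for search.
    conn.execute(
        "CREATE VIRTUAL TABLE transcripts "
        "USING fts5(speaker, text, recorded_at UNINDEXED)"
    )
    conn.executemany(
        "INSERT INTO transcripts (speaker, text, recorded_at) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn


def search(conn, phrase):
    """Return rows whose indexed text contains the exact phrase, best match first."""
    # Double quotes turn the input into an FTS5 phrase query.
    quoted = '"' + phrase.replace('"', '""') + '"'
    cursor = conn.execute(
        "SELECT recorded_at, speaker, text FROM transcripts "
        "WHERE transcripts MATCH ? ORDER BY rank",
        (quoted,),
    )
    return cursor.fetchall()
```

Calling `search(conn, "quarterly budget")` would return only the rows where those two words appear next to each other, which is the kind of lookup the explanation above describes.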
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
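The buffer, queue, and finalize steps above follow a common producer/consumer pattern, which might look roughly like this. It is a sketch only: `fake_transcribe` is a placeholder for a real speech-to-text call, and the `transcriptions` table is a made-up schema, not ScreenPipe's.

```python
import queue
import sqlite3
import threading


def fake_transcribe(chunk: bytes) -> str:
    """Hypothetical stand-in for a speech-to-text model call."""
    return f"<{len(chunk)} bytes of audio>"


def run_pipeline(chunks, transcribe=fake_transcribe):
    """Queue (recorded_at, audio_bytes) chunks, transcribe them, and commit the text."""
    pending = queue.Queue()  # the "work in progress" backlog
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE transcriptions (recorded_at TEXT, text TEXT)")

    def worker():
        while True:
            item = pending.get()
            if item is None:          # sentinel: no more chunks will arrive
                pending.task_done()
                break
            recorded_at, chunk = item
            text = transcribe(chunk)  # the slow, CPU/GPU-bound step
            conn.execute(
                "INSERT INTO transcriptions VALUES (?, ?)", (recorded_at, text)
            )
            conn.commit()             # finalization: the chunk is now "done"
            pending.task_done()

    thread = threading.Thread(target=worker)
    thread.start()
    for item in chunks:
        pending.put(item)
    pending.put(None)
    pending.join()
    thread.join()
    return conn.execute(
        "SELECT recorded_at, text FROM transcriptions ORDER BY recorded_at"
    ).fetchall()
```

While the worker is busy, `pending.qsize()` would be the size of the backlog; once it drains, every chunk has a committed row.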
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
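One way to check "done versus not done" from the folder is to compare the audio files on disk against whatever the database reports as transcribed. The sketch below assumes file names of the form `System Audio (output)_2026-05-11_06-17-14.mp4`, and it takes the set of already-transcribed file names as an input; how that set is obtained from the database is left open here, since the real schema is not confirmed. All function names are hypothetical.

```python
import re
from datetime import datetime
from pathlib import Path

# Assumed naming pattern: "<device> (input|output)_<YYYY-MM-DD>_<HH-MM-SS>.mp4"
NAME_RE = re.compile(
    r"^(?P<device>.+) \((?P<kind>input|output)\)_"
    r"(?P<ts>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)


def parse_chunk_name(name):
    """Return (device, kind, start_time) for an audio file name, or None."""
    match = NAME_RE.match(name)
    if match is None:
        return None
    started = datetime.strptime(match["ts"], "%Y-%m-%d_%H-%M-%S")
    return match["device"], match["kind"], started


def pending_chunks(data_dir, transcribed_names):
    """List .mp4 files in data_dir whose names are not in transcribed_names."""
    return sorted(
        path.name
        for path in Path(data_dir).glob("*.mp4")
        if path.name not in transcribed_names
    )
```

Anything returned by `pending_chunks` would be a file that exists on disk but has no matching transcription yet, under the assumptions above.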
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
38998
|
1442
|
30
|
2026-05-14T06:32:28.732535+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740348732_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — 
Archive","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik ([EMAIL])","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
8329327849920350305
|
9207950348869356183
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said
Explain to me: how does ScreenPipe audio recording work? Where does it start? How is the audio transferred and transcribed? Is there a work-in-progress stage? How can I tell from the folder structure what has been done and what hasn't?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is designed as a local-first, privacy-focused application: by default, its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the default.)
Diarization: As it transcribes, the engine also performs "diarization", that is, splitting the audio by speaker so each segment can be attributed to whoever said it. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to quickly search for a phrase you heard in a meeting three weeks ago.
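The storage step above can be illustrated with a minimal sketch of FTS5 phrase search, using Python's built-in sqlite3 module. The table name transcripts and the columns spoken_at and content are hypothetical and chosen for this example only; they are not ScreenPipe's actual schema. The sketch also assumes the local SQLite build includes the FTS5 extension.

```python
import sqlite3


def build_demo_index(rows):
    """Create an in-memory FTS5 table and load (spoken_at, content) rows into it."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table and column names, used for illustration only.
    conn.execute(
        "CREATE VIRTUAL TABLE transcripts USING fts5(spoken_at UNINDEXED, content)"
    )
    conn.executemany(
        "INSERT INTO transcripts (spoken_at, content) VALUES (?, ?)", rows
    )
    conn.commit()
    return conn


def search(conn, phrase):
    """Return (spoken_at, content) rows whose text contains the exact phrase."""
    # Double quotes around the text make FTS5 treat it as a phrase query;
    # embedded double quotes are escaped by doubling them.
    quoted = '"' + phrase.replace('"', '""') + '"'
    cursor = conn.execute(
        "SELECT spoken_at, content FROM transcripts "
        "WHERE transcripts MATCH ? ORDER BY rank",
        (quoted,),
    )
    return cursor.fetchall()
```

Calling search(conn, "quarterly budget") on such an index returns only the rows where those two words appear next to each other in that order, which is the kind of lookup a full-text index makes cheap.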
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see files in ~/.screenpipe/data/ such as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4, MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4, soundcore AeroClip (input)_2026-05-12_07-40-48.mp4, and System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcription and storage in the SQLite DB?
Gemini said
Yes, those .mp4 files are still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
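The file names quoted in the question appear to follow a "device name (direction)_date_time.mp4" pattern. As a rough sketch, assuming that pattern holds for every recording (it is inferred only from the four examples above, not from ScreenPipe's documentation), a name can be split into its parts like this. The function name parse_audio_name is hypothetical.

```python
import re
from datetime import datetime

# Pattern inferred from the example names in the question; recordings that
# are named differently will simply not match.
AUDIO_NAME = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)_"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)


def parse_audio_name(name):
    """Return (device, direction, timestamp) for a recording name, or None."""
    match = AUDIO_NAME.match(name)
    if match is None:
        return None
    # The timestamp embedded in the name is parsed as a naive datetime.
    stamp = datetime.strptime(match.group("stamp"), "%Y-%m-%d_%H-%M-%S")
    return match.group("device"), match.group("direction"), stamp
```

Grouping the parsed results by device and date gives a quick inventory of which microphones and outputs were recorded on a given day, without opening the database.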
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
41406
|
1531
|
2
|
2026-05-14T10:28:45.906505+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778754525906_m1.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said
Explain to me: how does ScreenPipe audio recording work? Where does it start? How is the audio transferred and transcribed? Is there a work-in-progress stage? How can I tell from the folder structure what has been done and what hasn't?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is designed as a local-first, privacy-focused application: by default, its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the default.)
Diarization: As it transcribes, the engine also performs "diarization", that is, splitting the audio by speaker so each segment can be attributed to whoever said it. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to quickly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
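The file names quoted in the question appear to follow a "device (direction)_date_time.mp4" pattern. Assuming that pattern holds for all audio files (it is inferred only from the four examples given), a small parser can recover the device, the direction, and the start time. The `AudioFile` type and `parse_audio_name` helper are hypothetical names used for this sketch.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class AudioFile(NamedTuple):
    device: str
    direction: str  # "input" or "output"
    started: datetime  # naive timestamp; the time zone is not in the file name


PATTERN = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)_"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)


def parse_audio_name(name: str) -> Optional[AudioFile]:
    """Split a recording's file name into its parts, or return None if it does not fit."""
    match = PATTERN.match(name)
    if match is None:
        return None
    started = datetime.strptime(match["stamp"], "%Y-%m-%d_%H-%M-%S")
    return AudioFile(match["device"], match["direction"], started)


info = parse_audio_name("System Audio (output)_2026-05-11_06-17-14.mp4")
```

Grouping the parsed results by `device` or by day is then a quick way to see which microphones were recorded and when.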
1. The "Time Machine" Playback
1. The "Time Machine" Playback...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: 
db.sqlite","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: db.sqlite","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.48576388,"top":0.0,"width":0.022222223,"height":0.035555556},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini 
(⌃X)","depth":6,"bounds":{"left":0.5086806,"top":0.0,"width":0.022222223,"height":0.035555556},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.53194445,"top":0.0,"width":0.022222223,"height":0.035555556},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.5552083,"top":0.0,"width":0.022222223,"height":0.035555556},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.5784722,"top":0.0,"width":0.022222223,"height":0.035555556},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New 
chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your 
hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. 
(Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". 
This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-5841410641967677039
|
9207950348869356183
|
visual_change
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database ( ): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The or folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals (like ), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
38843
|
1439
|
97
|
2026-05-14T06:29:41.765023+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740181765_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Conversation with Gemini
You said
Explain to me: how does ScreenPipe audio recording work? Where does it start? How is the audio transferred and transcribed? Is there a work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
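To make the storage step concrete, here is a minimal sketch of how an FTS5 index answers a phrase search. It assumes your Python build ships SQLite with FTS5 compiled in. The table name `transcripts`, its columns, and the sample rows are made up for illustration and are not ScreenPipe's actual schema.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table: "spoken_at" is stored but not indexed, "text" is searchable.
conn.execute(
    "CREATE VIRTUAL TABLE transcripts USING fts5(spoken_at UNINDEXED, text)"
)

# Made-up sample rows standing in for transcribed audio chunks.
conn.executemany(
    "INSERT INTO transcripts (spoken_at, text) VALUES (?, ?)",
    [
        ("2026-05-12T06:49:17", "let us move the release to next Friday"),
        ("2026-05-12T12:17:23", "the backup job finished overnight"),
    ],
)

# MATCH runs the full-text query; "rank" orders hits by relevance.
rows = conn.execute(
    "SELECT spoken_at, text FROM transcripts "
    "WHERE transcripts MATCH ? ORDER BY rank",
    ("release",),
).fetchall()

print(rows)
```

The point of the sketch is only that the search runs against an index of tokens rather than scanning every transcript, which is why lookups stay fast as the archive grows.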
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
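The three stages above can be pictured as a producer/consumer queue. The sketch below is a generic illustration of that pattern, not ScreenPipe's implementation: `transcribe` is a hypothetical stand-in for the speech-to-text call, and the chunk sizes are arbitrary.

```python
import queue
import threading


def transcribe(chunk: bytes) -> str:
    """Hypothetical stand-in for a speech-to-text call (not a real API)."""
    return f"<{len(chunk)} bytes transcribed>"


def worker(pending: queue.Queue, done: list) -> None:
    """Drain the queue one chunk at a time; a None item means 'stop'."""
    while True:
        chunk = pending.get()
        if chunk is None:
            pending.task_done()
            break
        # "Finalization": a real system would commit this text to a database.
        done.append(transcribe(chunk))
        pending.task_done()


pending: queue.Queue = queue.Queue()  # the "processing queue" (WIP items)
done: list = []                       # finished transcripts

consumer = threading.Thread(target=worker, args=(pending, done))
consumer.start()

# "Buffering": captured chunks are queued faster than they are transcribed.
for size in (16000, 32000, 8000):
    pending.put(b"\x00" * size)
pending.put(None)  # sentinel telling the worker to exit

pending.join()     # blocks until every queued item has been processed
consumer.join()
print(done)
```

Anything still sitting in `pending` is work in progress; anything in `done` corresponds to text that would already be searchable.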
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
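The file names quoted in the question appear to follow a `<device> (<input|output>)_<date>_<time>.mp4` pattern. As a rough sketch, assuming that pattern holds for every file (it is inferred only from the four names above, not from ScreenPipe documentation), the device, direction, and start time can be recovered like this. The helper name `parse_name` is hypothetical.

```python
import re
from datetime import datetime

# Pattern inferred from the example file names; treat it as an assumption.
PATTERN = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)_"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)


def parse_name(name: str):
    """Return device, direction and start time, or None if the name differs."""
    match = PATTERN.match(name)
    if match is None:
        return None
    return {
        "device": match.group("device"),
        "direction": match.group("direction"),
        "started": datetime.strptime(match.group("stamp"), "%Y-%m-%d_%H-%M-%S"),
    }


names = [
    "LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4",
    "System Audio (output)_2026-05-11_06-17-14.mp4",
]
parsed = [parse_name(n) for n in names]
for item in parsed:
    print(item)
```

Grouping the parsed entries by device or by day gives a quick overview of what has been recorded, independent of what the database says.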
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — 
Archive","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-6646203381232203441
|
9207950348869323415
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/
directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the
db.sqlite
database along with
precise timestamps and file path references...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
38901
|
1440
|
30
|
2026-05-14T06:30:30.156450+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740230156_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/
directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the
db.sqlite
database along with
precise timestamps and file path references
....
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — 
Archive","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik ([EMAIL])","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
4157554656475024107
|
9207950348869323415
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/
directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the
db.sqlite
database along with
precise timestamps and file path references
....
|
NULL
|
NULL
|
NULL
|
NULL
|
|
38990
|
1442
|
26
|
2026-05-14T06:32:23.461893+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740343461_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/
directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the
db.sqlite
database along with
precise timestamps and file path references
....
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — 
Archive","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
4157554656475024107
|
9207950348869323415
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
All docs · AFFiNE
DXP4800PLUS-B5F8
Screenpipe — Archive
SQLite Web: archive.db
SQLite Web: db.sqlite
Claude
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
New Tab
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
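The chunking step described above can be pictured with a minimal sketch. It assumes the audio is already available as a flat sequence of samples; the function name, the sample rate, and the chunk length are illustrative choices, not ScreenPipe's actual values.

```python
def split_into_chunks(samples, sample_rate, chunk_seconds):
    """Yield consecutive slices of `samples`, each at most chunk_seconds long."""
    # Number of samples per chunk (hypothetical parameters, not ScreenPipe settings).
    step = int(sample_rate * chunk_seconds)
    if step <= 0:
        raise ValueError("chunk length must cover at least one sample")
    for start in range(0, len(samples), step):
        # The final slice may be shorter than the others.
        yield samples[start:start + step]


chunks = list(split_into_chunks(list(range(10)), sample_rate=2, chunk_seconds=2))
```

Each yielded slice would then be handed to the transcription stage as one unit of work.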
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for speaker identification. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
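To make the full-text search step concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and sample rows are invented for illustration and are not ScreenPipe's real schema. It also assumes the local SQLite build includes the FTS5 extension, which is common but not guaranteed.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory FTS5 table and load (timestamp, text) rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table and column names, chosen for this example only.
    conn.execute("CREATE VIRTUAL TABLE transcripts USING fts5(ts UNINDEXED, text)")
    conn.executemany("INSERT INTO transcripts (ts, text) VALUES (?, ?)", rows)
    return conn


def search(conn, phrase):
    """Return (ts, text) rows whose text contains the exact phrase."""
    # Double quotes make FTS5 treat the input as one phrase, not separate terms.
    query = '"' + phrase.replace('"', '""') + '"'
    cur = conn.execute(
        "SELECT ts, text FROM transcripts WHERE transcripts MATCH ? ORDER BY ts",
        (query,),
    )
    return cur.fetchall()


conn = build_index([
    ("2026-05-12T06:49:17", "let us move the quarterly review to Friday"),
    ("2026-05-12T12:17:23", "the backup job finished overnight"),
])
hits = search(conn, "quarterly review")
```

Because the lookup goes through the FTS index rather than scanning every row, phrase searches stay fast even as the transcript table grows.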
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
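The buffer, queue, and finalize flow above can be sketched as a producer and a single worker. This is only an illustration of the pattern: `fake_transcribe` stands in for a real speech-to-text call, and all names here are hypothetical rather than taken from ScreenPipe's code.

```python
import queue
import threading


def fake_transcribe(chunk):
    """Stand-in for a speech-to-text call; a real engine would run a model here."""
    return f"transcript of {chunk}"


def run_pipeline(chunks, transcribe=fake_transcribe):
    """Queue chunks, transcribe them on one worker, return results in order."""
    pending = queue.Queue()
    done = []

    def worker():
        while True:
            chunk = pending.get()
            if chunk is None:  # sentinel: no more chunks will arrive
                break
            # "Finalization": record the finished text next to its chunk.
            done.append((chunk, transcribe(chunk)))

    t = threading.Thread(target=worker)
    t.start()
    for chunk in chunks:
        pending.put(chunk)  # chunks wait here while the worker is busy
    pending.put(None)
    t.join()
    return done


results = run_pipeline(["chunk-1", "chunk-2"])
```

Anything still sitting in `pending` corresponds to the work-in-progress backlog: captured but not yet transcribed.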
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database ([...]): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The [...] or [...] folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary [...] chunks, or locked database journals (like [...]), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
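One way to check for database journals is to look for the sidecar files SQLite itself creates next to a database: `-wal` and `-shm` in write-ahead-log mode, `-journal` in rollback-journal mode. The sketch below assumes only that standard SQLite naming; the helper name is hypothetical. Note that a sidecar mainly shows the database is open or was written recently, so on its own it is a weak signal of a transcription backlog.

```python
from pathlib import Path

# Sidecar suffixes SQLite appends to the database filename:
# "-wal" and "-shm" in write-ahead-log mode, "-journal" in rollback-journal mode.
SIDECAR_SUFFIXES = ("-wal", "-shm", "-journal")


def sqlite_sidecars(db_path):
    """Return the sidecar files that currently exist next to db_path."""
    db = Path(db_path)
    candidates = [db.with_name(db.name + suffix) for suffix in SIDECAR_SUFFIXES]
    return [p for p in candidates if p.exists()]
```

Calling `sqlite_sidecars` on the database file inside the ScreenPipe data directory returns an empty list when no sidecar files are present.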
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
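The filenames quoted in the question already carry the device name, the stream direction, and a start time. A small sketch of pulling those apart follows. The pattern is inferred only from the four example names above, so other ScreenPipe versions may name files differently, and whether the timestamp is local time or UTC is not stated in the source.

```python
import re
from datetime import datetime

# Pattern inferred from the quoted examples: "<device> (<direction>)_<date>_<time>.mp4"
NAME_RE = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)_"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)


def parse_audio_filename(name):
    """Split a recording filename into device, direction, and start time."""
    m = NAME_RE.match(name)
    if m is None:
        return None  # not a recording in the assumed naming scheme
    # Naive datetime: the source does not say which timezone the stamp uses.
    started = datetime.strptime(m.group("stamp"), "%Y-%m-%d_%H-%M-%S")
    return m.group("device"), m.group("direction"), started
```

Grouping the parsed results by device and day gives a quick overview of which recordings exist, without opening the database.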
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references. ...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
39024
|
1442
|
43
|
2026-05-14T06:33:03.215996+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740383215_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
All docs · AFFiNE
DXP4800PLUS-B5F8
Screenpipe — Archive
SQLite Web: archive.db
SQLite Web: db.sqlite
Claude
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
New Tab
Conversation with Gemini
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those
.mp4
files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your
~/.screenpipe/data/
directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the
db.sqlite
database along with
precise timestamps and file path references...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Screenpipe — 
Archive","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"db.sqlite","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"database along with","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"precise timestamps and file path references","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-6646203381232203441
|
9207950348869323415
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback
1. The "Time Machine" Playback
The primary reason ScreenPipe keeps these files is for audio playback. When Whisper transcribes your meetings or ambient audio, it writes the text into the db.sqlite database along with precise timestamps and file path references...
|
39022
|
NULL
|
NULL
|
NULL
|
|
38868
|
1440
|
12
|
2026-05-14T06:30:07.016030+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740207016_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Mute tab
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options
Copy prompt
You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Expand
Listen
Show thinking
Gemini said
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system....
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXButton","text":"Mute 
tab","depth":5,"bounds":{"left":0.011469414,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.020113032,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
4204195819504196869
|
9207950348802247319
|
click
|
accessibility
|
NULL
|
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is the audio transferred and transcribed? Is there a work-in-progress stage? How can I see from the folder structure what has been done and what hasn't?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is very short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for working out which speaker said what. It analyzes the audio to distinguish between your voice and the voices of others and labels each segment accordingly.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to quickly search for a phrase you heard in a meeting three weeks ago.
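The storage step above can be illustrated with a minimal sketch of an FTS5 phrase search. The table name `transcripts`, its columns, and the sample rows are hypothetical and chosen only for this example; they are not ScreenPipe's actual schema. The sketch also assumes the local Python build ships SQLite with FTS5 enabled, which is common but not guaranteed.

```python
import sqlite3

def build_demo_index(rows):
    """Create an in-memory FTS5 index over (ts, device, text) rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table and column names, used for illustration only.
    # UNINDEXED columns are stored but not searched.
    conn.execute(
        "CREATE VIRTUAL TABLE transcripts "
        "USING fts5(ts UNINDEXED, device UNINDEXED, text)"
    )
    conn.executemany("INSERT INTO transcripts VALUES (?, ?, ?)", rows)
    return conn

def search(conn, phrase):
    """Return (ts, device, text) rows whose text contains the exact phrase."""
    # Wrapping the input in double quotes makes FTS5 treat it as one phrase.
    query = '"' + phrase.replace('"', '""') + '"'
    cur = conn.execute(
        "SELECT ts, device, text FROM transcripts "
        "WHERE transcripts MATCH ? ORDER BY rank",
        (query,),
    )
    return cur.fetchall()

# Made-up sample data; the timestamps and sentences are not from a real archive.
rows = [
    ("2026-05-12T06:49:17", "System Audio (output)", "let us move the release to Friday"),
    ("2026-05-12T07:40:48", "MacBook Pro Microphone (input)", "Friday works for me"),
    ("2026-05-12T12:17:23", "MacBook Pro Microphone (input)", "remember to back up the archive"),
]
conn = build_demo_index(rows)
hits = search(conn, "release to Friday")
```

Here `hits` holds only the first sample row, because that is the only text containing the three words in that order; a single-word query such as `Friday` would match the first two rows.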
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
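The buffering, queueing, and finalization steps above follow a general producer/consumer pattern, which can be sketched as below. This is not ScreenPipe's code: `fake_transcribe` is a hypothetical stand-in for the speech-to-text call, and the byte strings stand in for audio chunks.

```python
import queue
import threading

def fake_transcribe(chunk: bytes) -> str:
    """Hypothetical stand-in for a speech-to-text call on one audio chunk."""
    return f"<{len(chunk)} bytes transcribed>"

def worker(pending: queue.Queue, done: list) -> None:
    """Drain the queue in arrival order until a None sentinel is received."""
    while True:
        item = pending.get()
        if item is None:
            break
        chunk_id, chunk = item
        # "Finalization": record the result once the chunk is transcribed.
        done.append((chunk_id, fake_transcribe(chunk)))

pending = queue.Queue()
done = []
consumer = threading.Thread(target=worker, args=(pending, done))
consumer.start()

# "Buffering": the capture side enqueues chunks faster than they are processed.
for chunk_id in range(3):
    pending.put((chunk_id, b"\x00" * 1600 * (chunk_id + 1)))
pending.put(None)  # signal that no more chunks are coming
consumer.join()
```

Anything still sitting in `pending` at a given moment corresponds to the work-in-progress backlog; entries in `done` correspond to chunks whose text would already be committed.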
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (…): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The … or … folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals (like …), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system....
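The file names quoted in the question follow a visible pattern: device name, `(input)` or `(output)`, then a date and time, then `.mp4`. A minimal sketch of splitting such a name into its parts follows. The pattern is inferred only from the four example names in the question, the helper `parse_chunk_name` is hypothetical, and what exactly the timestamp marks (for instance the start of the recording) is an assumption.

```python
import re
from datetime import datetime

# Pattern inferred from the example file names only; other naming
# variants may exist and would return None here.
PATTERN = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)_"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)

def parse_chunk_name(name):
    """Split a recording file name into (device, direction, timestamp)."""
    m = PATTERN.match(name)
    if m is None:
        return None
    stamp = datetime.strptime(m.group("stamp"), "%Y-%m-%d_%H-%M-%S")
    return m.group("device"), m.group("direction"), stamp

info = parse_chunk_name("MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4")
```

Grouping the parsed tuples by device or by day is one way to get an overview of what has been recorded, which can then be compared against what the database contains.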
|
NULL
|
NULL
|
NULL
|
NULL
|
|
38888
|
1440
|
23
|
2026-05-14T06:30:19.808140+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740219808_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
You said
Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is the audio transferred and transcribed? Is there a work-in-progress stage? How can I see from the folder structure what has been done and what hasn't?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is very short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for working out which speaker said what. It analyzes the audio to distinguish between your voice and the voices of others and labels each segment accordingly.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to quickly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (…): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The … or … folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary chunks, or locked database journals (like …), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXButton","text":"Mute 
tab","depth":5,"bounds":{"left":0.011469414,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.020113032,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-6561738457798499810
|
9207950348802247319
|
visual_change
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Mute tab
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage:
The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
1. The "Time Machine" Playback...
|
38886
|
NULL
|
NULL
|
NULL
|
|
38940
|
1442
|
1
|
2026-05-14T06:31:20.871093+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-14/1778 /Users/lukas/.screenpipe/data/data/2026-05-14/1778740280871_m2.jpg...
|
Firefox
|
Screenpipe — Archive — Personal
|
1
|
app.screenpipe.lakylak.xyz
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Screenpipe — Archive
Mute tab
Screenpipe — Archive
Close tab
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage:
The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
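The file names quoted in the question appear to follow a `<device> (<input|output>)_<date>_<time>.mp4` pattern. As a minimal sketch, assuming that pattern holds (it is inferred from the four example names only, not from ScreenPipe documentation), the device, direction, and start time can be recovered from a name as follows. `AudioFile` and `parse_audio_name` are hypothetical helper names, and whether the timestamp is local time or UTC is not known from the examples.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Pattern inferred from the example file names only; real names may differ.
AUDIO_NAME = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)_"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)


class AudioFile(NamedTuple):
    device: str
    direction: str  # "input" (microphone) or "output" (system audio)
    started_at: datetime  # naive datetime; time zone is an open assumption


def parse_audio_name(name: str) -> Optional[AudioFile]:
    """Split an audio file name into device, direction, and start time."""
    match = AUDIO_NAME.match(name)
    if match is None:
        return None  # name does not follow the assumed pattern
    started = datetime.strptime(match["stamp"], "%Y-%m-%d_%H-%M-%S")
    return AudioFile(match["device"], match["direction"], started)
```

Grouping the parsed results by `device` and sorting by `started_at` gives a quick per-microphone timeline of what has been recorded, without opening the database.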
1. The "Time Machine" Playback...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXButton","text":"Mute 
tab","depth":5,"bounds":{"left":0.011469414,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.020113032,"top":0.16121309,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.15722266,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.34796488,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. 
Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. 
How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"on_screen":false,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and export","depth":23,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after transcribing and storing in sqlite db?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"I see the files in ~/.screenpipe/data/ as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4 or MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4 or soundcore AeroClip (input)_2026-05-12_07-40-48.mp4 or System Audio (output)_2026-05-11_06-17-14.mp4. 
Are these used for anything after transcribing and storing in sqlite db?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Expand","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Yes, those","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":".mp4","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"files are absolutely still used after the transcription process is complete. 
They are not temporary processing chunks; they act as your permanent media archive.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is exactly what those files are doing sitting in your","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe/data/","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"directory and how they interact with the rest of the system.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"1. The \"Time Machine\" Playback","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. The \"Time Machine\" Playback","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-6561738457798499810
|
9207950348802247319
|
click
|
accessibility
|
NULL
|
Conversation with Gemini
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
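To make the FTS5 search concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name transcripts, its columns, and the sample rows are all hypothetical; ScreenPipe's real schema may look different, and the sketch assumes the local SQLite build includes FTS5.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# Hypothetical FTS5 table; ScreenPipe's actual table and column names may differ.
# UNINDEXED columns are stored but excluded from full-text matching.
conn.execute(
    "CREATE VIRTUAL TABLE transcripts "
    "USING fts5(device UNINDEXED, spoken_at UNINDEXED, text)"
)
# Made-up sample rows for illustration only.
conn.executemany(
    "INSERT INTO transcripts (device, spoken_at, text) VALUES (?, ?, ?)",
    [
        ("System Audio (output)", "2026-05-11T06:17:14",
         "let us review the quarterly budget"),
        ("MacBook Pro Microphone (input)", "2026-05-12T12:17:23",
         "I will send the notes after lunch"),
    ],
)


def search(phrase: str) -> list:
    """Return (device, spoken_at, text) rows whose text contains the phrase."""
    # Wrapping the phrase in double quotes makes FTS5 treat it as one phrase.
    query = '"' + phrase.replace('"', '""') + '"'
    return conn.execute(
        "SELECT device, spoken_at, text FROM transcripts "
        "WHERE transcripts MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()


print(search("quarterly budget"))
```

The same MATCH query shape would apply to an on-disk database once the real table name is known.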
3. The "Work in Progress" (WIP) Stage
There is a short delay between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If audio arrives faster than it can be transcribed, for example during a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
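The buffering, queueing, and finalization steps can be pictured as a simple producer/consumer queue. The sketch below is only an illustration of that pattern: fake_transcribe is a placeholder rather than a real speech-to-text call, the chunk names are invented, and nothing here reflects ScreenPipe's actual internals.

```python
import queue
import threading

chunk_queue: "queue.Queue" = queue.Queue()
finished: list = []            # stands in for rows committed to the database


def fake_transcribe(chunk: str) -> str:
    """Placeholder for the speech-to-text call; it only tags the chunk name."""
    return f"text for {chunk}"


def worker() -> None:
    """Take chunks off the queue one at a time until a None sentinel arrives."""
    while True:
        chunk = chunk_queue.get()
        if chunk is None:
            chunk_queue.task_done()
            break
        finished.append((chunk, fake_transcribe(chunk)))
        chunk_queue.task_done()


thread = threading.Thread(target=worker)
thread.start()

for name in ["chunk-001", "chunk-002", "chunk-003"]:
    chunk_queue.put(name)      # recorded and waiting for transcription (WIP)
chunk_queue.put(None)          # tell the worker to stop

chunk_queue.join()             # blocks until every queued item is processed
thread.join()
print(finished)
```

Anything still sitting in chunk_queue corresponds to the WIP stage; anything in finished corresponds to "done".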
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
You said
I see files in ~/.screenpipe/data/ such as LakyLak bose qc35 II (input)_2026-05-12_06-49-17.mp4, MacBook Pro Microphone (input)_2026-05-12_12-17-23.mp4, soundcore AeroClip (input)_2026-05-12_07-40-48.mp4, and System Audio (output)_2026-05-11_06-17-14.mp4. Are these used for anything after they have been transcribed and stored in the SQLite DB?
Gemini said
Yes, those .mp4 files are absolutely still used after the transcription process is complete. They are not temporary processing chunks; they act as your permanent media archive.
Here is exactly what those files are doing sitting in your ~/.screenpipe/data/ directory and how they interact with the rest of the system.
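The file names quoted in the question follow a visible pattern: device name, direction in parentheses, then a date and time. The sketch below parses that pattern. The regular expression is inferred only from those four example names, and reading the timestamp as the recording start time is an assumption; the helper names AudioFile and parse_audio_filename are hypothetical.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class AudioFile(NamedTuple):
    device: str
    direction: str        # "input" or "output"
    started_at: datetime  # assumed to be the recording start time


# Pattern inferred from the example file names only.
NAME_RE = re.compile(
    r"^(?P<device>.+) \((?P<direction>input|output)\)_"
    r"(?P<stamp>\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})\.mp4$"
)


def parse_audio_filename(name: str) -> Optional[AudioFile]:
    """Split a file name into device, direction, and time, or return None."""
    match = NAME_RE.match(name)
    if match is None:
        return None
    started_at = datetime.strptime(match["stamp"], "%Y-%m-%d_%H-%M-%S")
    return AudioFile(match["device"], match["direction"], started_at)


print(parse_audio_filename("System Audio (output)_2026-05-11_06-17-14.mp4"))
```

Grouping the parsed results by device or by day gives a quick overview of what has been recorded without opening the database.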
1. The "Time Machine" Playback
1. The "Time Machine" Playback...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
30867
|
1217
|
17
|
2026-05-13T08:05:59.058997+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-13/1778 /Users/lukas/.screenpipe/data/data/2026-05-13/1778659559058_m1.jpg...
|
Firefox
|
DXP4800PLUS-B5F8 — Personal
|
1
|
nas.lakylak.xyz/desktop/#/
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Close tab
New Tab
New Tab
Screenpipe — Archive
Screenpipe — Archive
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
You said
Explain to me: how does ScreenPipe audio recording work? Where does it start? How is the audio transferred and transcribed? Is there a work-in-progress stage? How can I see from the folder structure what has been done and what hasn't?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", the technical term for working out which speaker is talking when. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
There is a short delay between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If audio arrives faster than it can be transcribed, for example during a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
21.1
KB/s
14.6
KB/s
Files
Control Panel
Storage
App Center
Logs
Support
Task Manager
Music
Cloud Drives
Theater
Photos
Online Office
TextEdit
Virtual Machine
Downloads
DLNA
File Version Explorer
Security
Jellyfin-HT
SAN Manager
Vault
Snapshot
Comics
Sync & Backup
UGREEN AI
Recycle Bin
Control Panel
Search
Connection & Access
User Management
File Service
Device Connection
Domain/LDAP
Terminal
General
Hardware & Power
Time & Language
Network
Security
Indexing Service
Service
About
Update & Restore
Telnet
Enable
Enable
Port
23
Advanced settings
SSH
Enable
Enable
Port
22...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"New Tab","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: 
archive.db","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: db.sqlite","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize 
sidebar","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New 
chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your 
hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. 
(Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". 
This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more 
options","depth":22,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXTextArea","text":"Ask Gemini","depth":20,"on_screen":true,"value":"Ask Gemini","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":true,"is_selected":false},{"role":"AXStaticText","text":"Ask Gemini","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file menu","depth":20,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Tools","depth":18,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open mode picker","depth":20,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Pro","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Microphone","depth":19,"on_screen":true,"role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Send message","depth":19,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":false,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini is AI and can make mistakes, including about people.","depth":17,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Your privacy and Gemini Opens in a new 
window","depth":17,"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Your privacy and Gemini","depth":18,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Opens in a new window","depth":19,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Summarize page","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Summarize page","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"","depth":18,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"21.1","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"KB/s","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"14.6","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"KB/s","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Files","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Control Panel","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Storage","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"App 
Center","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Logs","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Support","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Task Manager","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Music","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Cloud Drives","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Theater","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Photos","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Online Office","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"TextEdit","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Virtual Machine","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Downloads","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"DLNA","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"File Version 
Explorer","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Security","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Jellyfin-HT","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SAN Manager","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Vault","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Snapshot","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Comics","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Sync & Backup","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"UGREEN AI","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Recycle Bin","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Control Panel","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"","depth":13,"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"","depth":14,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"","depth":19,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXTextField","text":"Search","depth":18,"on_screen":true,"help_text":"","role_description":"text 
field","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Connection & Access","depth":19,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"User Management","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"File Service","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Device Connection","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Domain/LDAP","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Terminal","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"General","depth":19,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Hardware & Power","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Time & Language","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Network","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Security","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Indexing 
Service","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Service","depth":19,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"About","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Update & Restore","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Telnet","depth":18,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Enable","depth":18,"on_screen":true,"help_text":"","role_description":"checkbox","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Enable","depth":18,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Port","depth":19,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXTextField","text":"23","depth":20,"on_screen":true,"value":"23","help_text":"","role_description":"text field","subrole":"AXUnknown","is_enabled":false,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Advanced 
settings","depth":18,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":false,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SSH","depth":18,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"","depth":20,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Enable","depth":18,"on_screen":true,"help_text":"","role_description":"checkbox","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Enable","depth":18,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Port","depth":19,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXTextField","text":"22","depth":20,"on_screen":true,"value":"22","help_text":"","role_description":"text field","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false}]...
|
4829425383809299798
|
9207947049725643671
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Close tab
New Tab
New Tab
Screenpipe — Archive
Screenpipe — Archive
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
Edit
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Redo
Share and export
Copy
Show more options
Ask Gemini
Ask Gemini
Open upload file menu
Tools
Open mode picker
Pro
Microphone
Send message
Gemini is AI and can make mistakes, including about people.
Your privacy and Gemini Opens in a new window
Your privacy and Gemini
Opens in a new window
Summarize page
Summarize page
21.1
KB/s
14.6
KB/s
Files
Control Panel
Storage
App Center
Logs
Support
Task Manager
Music
Cloud Drives
Theater
Photos
Online Office
TextEdit
Virtual Machine
Downloads
DLNA
File Version Explorer
Security
Jellyfin-HT
SAN Manager
Vault
Snapshot
Comics
Sync & Backup
UGREEN AI
Recycle Bin
Control Panel
Search
Connection & Access
User Management
File Service
Device Connection
Domain/LDAP
Terminal
General
Hardware & Power
Time & Language
Network
Security
Indexing Service
Service
About
Update & Restore
Telnet
Enable
Enable
Port
23
Advanced settings
SSH
Enable
Enable
Port
22...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
30868
|
1218
|
20
|
2026-05-13T08:05:59.056705+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-13/1778 /Users/lukas/.screenpipe/data/data/2026-05-13/1778659559056_m2.jpg...
|
Firefox
|
DXP4800PLUS-B5F8 — Personal
|
1
|
nas.lakylak.xyz/desktop/#/
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Close tab
New Tab
New Tab
Screenpipe — Archive
Screenpipe — Archive
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
Edit
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Redo
Share and export
Copy
Show more options
Ask Gemini
Ask Gemini
Open upload file menu
Tools
Open mode picker
Pro
Microphone
Send message
Gemini is AI and can make mistakes, including about people.
Your privacy and Gemini Opens in a new window
Your privacy and Gemini
Opens in a new window
Summarize page
Summarize page
21.1
KB/s
14.6
KB/s
Files
Control Panel
Storage
App Center
Logs
Support
Task Manager
Music
Cloud Drives
Theater
Photos
Online Office
TextEdit
Virtual Machine
Downloads
DLNA
File Version Explorer
Security
Jellyfin-HT
SAN Manager
Vault
Snapshot
Comics
Sync & Backup
UGREEN AI
Recycle Bin
Control Panel
Search
Connection & Access
User Management
File Service
Device Connection
Domain/LDAP
Terminal
General
Hardware & Power
Time & Language
Network
Security
Indexing Service
Service
About
Update & Restore
Telnet
Enable
Enable
Port
23
Advanced settings...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.1245012,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"New Tab","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.014960106,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.3463687,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.3575419,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.38068634,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your 
hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. 
(Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"bounds":{"left":0.15674867,"top":0.0,"width":0.008643617,"height":0.015961692},"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.029753989,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.21708776,"height":0.037110932},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.21958111,"height":0.037110932},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"bounds":{"left":0.203125,"top":0.0,"width":0.008643617,"height":0.015961692},"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"bounds":{"left":0.09142287,"top":0.018754989,"width":0.021775266,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"bounds":{"left":0.11319814,"top":0.018754989,"width":0.12849069,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"bounds":{"left":0.24168883,"top":0.018754989,"width":0.043218084,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"bounds":{"left":0.09142287,"top":0.018754989,"width":0.21392952,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"bounds":{"left":0.0787899,"top":0.1009577,"width":0.234375,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"bounds":{"left":0.0787899,"top":0.102553874,"width":0.09690824,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"bounds":{"left":0.0787899,"top":0.12889066,"width":0.0887633,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"bounds":{"left":0.16755319,"top":0.12889066,"width":0.018284574,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"bounds":{"left":0.18583776,"top":0.12889066,"width":0.036070477,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"bounds":{"left":0.22190824,"top":0.12889066,"width":0.015458777,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"bounds":{"left":0.0787899,"top":0.12889066,"width":0.234375,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"bounds":{"left":0.09142287,"top":0.17917,"width":0.025764627,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"bounds":{"left":0.09142287,"top":0.17917,"width":0.2122673,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing 
Queue:","depth":29,"bounds":{"left":0.09142287,"top":0.22944932,"width":0.04837101,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"bounds":{"left":0.09142287,"top":0.22944932,"width":0.22057846,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"bounds":{"left":0.09142287,"top":0.27972865,"width":0.031416222,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"bounds":{"left":0.09142287,"top":0.27972865,"width":0.21991356,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"bounds":{"left":0.0787899,"top":0.36193135,"width":0.234375,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. 
Understanding the Folder Structure","depth":27,"bounds":{"left":0.0787899,"top":0.36352754,"width":0.09823803,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"bounds":{"left":0.0787899,"top":0.38986433,"width":0.20910904,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"bounds":{"left":0.080784574,"top":0.41181165,"width":0.036236703,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"bounds":{"left":0.0787899,"top":0.41061452,"width":0.22988696,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"bounds":{"left":0.09142287,"top":0.46089387,"width":0.05817819,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"bounds":{"left":0.17885639,"top":0.46089387,"width":0.0034906915,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"bounds":{"left":0.09142287,"top":0.46089387,"width":0.21010639,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"bounds":{"left":0.09142287,"top":0.5111732,"width":0.011303191,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"bounds":{"left":0.120678194,"top":0.5111732,"width":0.008144947,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"bounds":{"left":0.14960106,"top":0.5111732,"width":0.020944148,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. 
Think of this as the raw archive.","depth":29,"bounds":{"left":0.09142287,"top":0.5111732,"width":0.21542554,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"bounds":{"left":0.09142287,"top":0.5614525,"width":0.028922873,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"bounds":{"left":0.12034574,"top":0.5614525,"width":0.106715426,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"bounds":{"left":0.09142287,"top":0.5614525,"width":0.21991356,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"bounds":{"left":0.09142287,"top":0.58220273,"width":0.22174202,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"bounds":{"left":0.0787899,"top":0.6644054,"width":0.22456782,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"bounds":{"left":0.0787899,"top":0.7186752,"width":0.036402926,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"bounds":{"left":0.091755316,"top":0.7274541,"width":0.017785905,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"bounds":{"left":0.075465426,"top":0.7633679,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"bounds":{"left":0.08610372,"top":0.7633679,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle 
button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"bounds":{"left":0.09674202,"top":0.7633679,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"bounds":{"left":0.107380316,"top":0.7633679,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"bounds":{"left":0.11801862,"top":0.7633679,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":22,"bounds":{"left":0.12865691,"top":0.7633679,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXTextArea","text":"Ask Gemini","depth":20,"bounds":{"left":0.08211436,"top":0.83439744,"width":0.22573139,"height":0.01915403},"on_screen":true,"value":"Ask Gemini","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":true,"is_selected":false},{"role":"AXStaticText","text":"Ask Gemini","depth":21,"bounds":{"left":0.08211436,"top":0.8347965,"width":0.030086435,"height":0.018355945},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file 
menu","depth":20,"bounds":{"left":0.078125,"top":0.87031126,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Tools","depth":18,"bounds":{"left":0.094082445,"top":0.87031126,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open mode picker","depth":20,"bounds":{"left":0.27044547,"top":0.867917,"width":0.026097074,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Pro","depth":23,"bounds":{"left":0.2757646,"top":0.87669593,"width":0.007480053,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Microphone","depth":19,"bounds":{"left":0.29853722,"top":0.867917,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Send message","depth":19,"bounds":{"left":0.30485374,"top":0.8671189,"width":0.013962766,"height":0.033519555},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":false,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini is AI and can make mistakes, including about people.","depth":17,"bounds":{"left":0.11702128,"top":0.92178774,"width":0.11170213,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Your privacy and Gemini Opens in a new 
window","depth":17,"bounds":{"left":0.2287234,"top":0.92178774,"width":0.044215426,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Your privacy and Gemini","depth":18,"bounds":{"left":0.2287234,"top":0.92178774,"width":0.044215426,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Opens in a new window","depth":19,"bounds":{"left":0.068484046,"top":0.92098963,"width":0.043218084,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Summarize page","depth":7,"bounds":{"left":0.07413564,"top":0.95730245,"width":0.053523935,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Summarize 
page","depth":9,"bounds":{"left":0.07978723,"top":0.96249,"width":0.042220745,"height":0.015163607},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"","depth":18,"bounds":{"left":0.9772274,"top":0.06304868,"width":0.0066489363,"height":0.015961692},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"21.1","depth":16,"bounds":{"left":0.92486703,"top":0.06264964,"width":0.0071476065,"height":0.008379889},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"KB/s","depth":16,"bounds":{"left":0.93201464,"top":0.06304868,"width":0.005984043,"height":0.0075818035},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"14.6","depth":16,"bounds":{"left":0.92486703,"top":0.07222666,"width":0.0071476065,"height":0.008379889},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"KB/s","depth":16,"bounds":{"left":0.93201464,"top":0.0726257,"width":0.005984043,"height":0.0075818035},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Files","depth":13,"bounds":{"left":0.34690824,"top":0.1707901,"width":0.009973404,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Control Panel","depth":13,"bounds":{"left":0.33776596,"top":0.2697526,"width":0.02825798,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Storage","depth":13,"bounds":{"left":0.34375,"top":0.36871508,"width":0.016289894,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"App 
Center","depth":13,"bounds":{"left":0.34009308,"top":0.46767756,"width":0.023603724,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Logs","depth":13,"bounds":{"left":0.34690824,"top":0.5666401,"width":0.009973404,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Support","depth":13,"bounds":{"left":0.34375,"top":0.66560256,"width":0.016289894,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Task Manager","depth":13,"bounds":{"left":0.33726728,"top":0.76456505,"width":0.02925532,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Music","depth":13,"bounds":{"left":0.34574467,"top":0.86352754,"width":0.012300532,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Cloud Drives","depth":13,"bounds":{"left":0.38646942,"top":0.1707901,"width":0.026595745,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Theater","depth":13,"bounds":{"left":0.39178857,"top":0.2697526,"width":0.015957447,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Photos","depth":13,"bounds":{"left":0.39245346,"top":0.36871508,"width":0.01462766,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Online 
Office","depth":13,"bounds":{"left":0.3863032,"top":0.46767756,"width":0.026928192,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"TextEdit","depth":13,"bounds":{"left":0.39145613,"top":0.5666401,"width":0.01662234,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Virtual Machine","depth":13,"bounds":{"left":0.38380983,"top":0.66560256,"width":0.031914894,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Downloads","depth":13,"bounds":{"left":0.3882979,"top":0.76456505,"width":0.022938829,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"DLNA","depth":13,"bounds":{"left":0.39361703,"top":0.86352754,"width":0.012300532,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"File Version Explorer","depth":13,"bounds":{"left":0.4261968,"top":0.1707901,"width":0.04288564,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Security","depth":13,"bounds":{"left":0.43916222,"top":0.2697526,"width":0.016954787,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Jellyfin-HT","depth":13,"bounds":{"left":0.43666887,"top":0.36871508,"width":0.021941489,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SAN 
Manager","depth":13,"bounds":{"left":0.43301198,"top":0.46767756,"width":0.02925532,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Vault","depth":13,"bounds":{"left":0.4424867,"top":0.5666401,"width":0.010305851,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Snapshot","depth":13,"bounds":{"left":0.43783244,"top":0.66560256,"width":0.019614361,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Comics","depth":13,"bounds":{"left":0.4398271,"top":0.76456505,"width":0.015625,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Sync & Backup","depth":13,"bounds":{"left":0.4318484,"top":0.86352754,"width":0.03158245,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"UGREEN AI","depth":13,"bounds":{"left":0.48271278,"top":0.1707901,"width":0.025598405,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Recycle Bin","depth":13,"bounds":{"left":0.48321143,"top":0.2697526,"width":0.024601065,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Control 
Panel","depth":13,"bounds":{"left":0.5965758,"top":0.1348763,"width":0.025930852,"height":0.011173184},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"","depth":13,"bounds":{"left":0.741855,"top":0.13088587,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"","depth":14,"bounds":{"left":0.74318486,"top":0.13407822,"width":0.005319149,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"","depth":19,"bounds":{"left":0.47755983,"top":0.1707901,"width":0.004654255,"height":0.011572227},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXTextField","text":"Search","depth":18,"bounds":{"left":0.48487368,"top":0.16360734,"width":0.028922873,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text field","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Connection & Access","depth":19,"bounds":{"left":0.46392953,"top":0.21468475,"width":0.037898935,"height":0.011173184},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"User Management","depth":21,"bounds":{"left":0.4739029,"top":0.2490024,"width":0.040059842,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"File Service","depth":21,"bounds":{"left":0.4739029,"top":0.28731045,"width":0.025930852,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Device 
Connection","depth":21,"bounds":{"left":0.4739029,"top":0.3256185,"width":0.025598405,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Domain/LDAP","depth":21,"bounds":{"left":0.4739029,"top":0.3830806,"width":0.031083776,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Terminal","depth":21,"bounds":{"left":0.4739029,"top":0.42138866,"width":0.019115692,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"General","depth":19,"bounds":{"left":0.46392953,"top":0.46049482,"width":0.01412899,"height":0.011173184},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Hardware & Power","depth":21,"bounds":{"left":0.4739029,"top":0.49481246,"width":0.04105718,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Time & Language","depth":21,"bounds":{"left":0.4739029,"top":0.5331205,"width":0.03873005,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Network","depth":21,"bounds":{"left":0.4739029,"top":0.5714286,"width":0.018284574,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Security","depth":21,"bounds":{"left":0.4739029,"top":0.6097366,"width":0.018284574,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Indexing 
Service","depth":21,"bounds":{"left":0.4739029,"top":0.6480447,"width":0.036901597,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Service","depth":19,"bounds":{"left":0.46392953,"top":0.68715084,"width":0.013297873,"height":0.011173184},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"About","depth":21,"bounds":{"left":0.4739029,"top":0.72146845,"width":0.013464096,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Update & Restore","depth":21,"bounds":{"left":0.4739029,"top":0.75977653,"width":0.0390625,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Telnet","depth":18,"bounds":{"left":0.5365692,"top":0.18515563,"width":0.012466756,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Enable","depth":18,"bounds":{"left":0.55734706,"top":0.18635276,"width":0.004654255,"height":0.011173184},"on_screen":true,"help_text":"","role_description":"checkbox","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Enable","depth":18,"bounds":{"left":0.5646609,"top":0.18515563,"width":0.014461436,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Port","depth":19,"bounds":{"left":0.56432843,"top":0.21388668,"width":0.008477394,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXTextField","text":"23","depth":20,"bounds":{"left":0.57978725,"top":0.20830008,"width":0.06781915,"height":0.023942538},"on_screen":true,"value":"23","help_text":"","role_description":"text 
field","subrole":"AXUnknown","is_enabled":false,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Advanced settings","depth":18,"bounds":{"left":0.55734706,"top":0.24381484,"width":0.0546875,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":false,"is_focused":false,"is_selected":false}]...
|
4723601339976516027
|
9207947049725643671
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Close tab
New Tab
New Tab
Screenpipe — Archive
Screenpipe — Archive
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
Edit
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
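The storage step described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration only: the table name `transcripts`, its columns, and the sample rows are hypothetical and are not taken from Screenpipe's actual schema. It also assumes the local SQLite build has FTS5 compiled in.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory FTS5 index from (timestamp, speaker, text) rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: timestamp and speaker are stored but not tokenized.
    conn.execute(
        "CREATE VIRTUAL TABLE transcripts USING fts5("
        "timestamp UNINDEXED, speaker UNINDEXED, text)"
    )
    conn.executemany("INSERT INTO transcripts VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def search(conn, phrase):
    """Return rows whose text contains the phrase, best match first."""
    # Wrapping the input in double quotes makes FTS5 treat it as one phrase.
    query = '"' + phrase.replace('"', '""') + '"'
    cur = conn.execute(
        "SELECT timestamp, speaker, text FROM transcripts "
        "WHERE transcripts MATCH ? ORDER BY rank",
        (query,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    # Made-up sample data for illustration.
    demo = build_index([
        ("t1", "speaker_1", "Let us review the quarterly budget today"),
        ("t2", "speaker_2", "The budget for travel is separate"),
    ])
    for row in search(demo, "quarterly budget"):
        print(row)
```

The phrase query is what makes "search for something heard weeks ago" cheap: the lookup goes through the inverted index rather than scanning every transcript row.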
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
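The buffering, queueing, and finalization steps above follow a common producer/consumer pattern, which can be sketched as below. This is not Screenpipe's implementation: `fake_transcribe`, `worker`, `run_pipeline`, and the `transcripts` table are hypothetical names, and the transcription call is a stub standing in for a real speech-to-text model.

```python
import os
import queue
import sqlite3
import tempfile
import threading


def fake_transcribe(chunk: bytes) -> str:
    """Stand-in for a speech-to-text call; returns a placeholder string."""
    return f"<{len(chunk)} bytes of audio>"


def worker(pending: queue.Queue, db_path: str) -> None:
    """Drain the queue, 'transcribe' each chunk, and commit it to SQLite."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS transcripts (ts REAL, text TEXT)")
    while True:
        item = pending.get()
        if item is None:  # sentinel: no more chunks will arrive
            break
        ts, chunk = item
        conn.execute(
            "INSERT INTO transcripts VALUES (?, ?)", (ts, fake_transcribe(chunk))
        )
        conn.commit()  # a chunk counts as "done" only after this commit
    conn.close()


def run_pipeline(raw_chunks, db_path):
    """Queue every chunk, wait for the worker, and return the stored rows."""
    pending = queue.Queue()
    thread = threading.Thread(target=worker, args=(pending, db_path))
    thread.start()
    for i, chunk in enumerate(raw_chunks):
        pending.put((float(i), chunk))  # chunks wait here while the worker is busy
    pending.put(None)
    thread.join()
    conn = sqlite3.connect(db_path)
    rows = conn.execute("SELECT ts, text FROM transcripts ORDER BY ts").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run_pipeline([b"abc", b"defgh"], os.path.join(tmp, "demo.sqlite")))
```

In this sketch the "work in progress" state is simply whatever is still sitting in `pending`; nothing is visible in the database until the worker commits it.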
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
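A folder check along the lines described above can be sketched as follows. The function name `classify` and the file extensions it looks for are assumptions for illustration; the sidecar suffixes are the standard ones SQLite itself uses (`-wal`, `-shm`, `-journal`). Note that a `-wal` file on its own only shows the database is in write-ahead-log mode or has an open connection, so it is at best a weak hint of a backlog; querying the database for the newest transcript timestamp is a more reliable signal.

```python
from pathlib import Path

# Standard SQLite sidecar suffixes; whether a given app leaves them behind
# depends on its journal mode, so treat their presence as a hint only.
SIDECAR_SUFFIXES = ("-wal", "-shm", "-journal")
DATABASE_SUFFIXES = (".sqlite", ".db")  # assumed extensions, adjust as needed


def classify(root):
    """Group files under root into databases, SQLite sidecars, and other files."""
    root = Path(root)
    report = {"databases": [], "sidecars": [], "other": []}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        if path.name.endswith(SIDECAR_SUFFIXES):
            report["sidecars"].append(rel)
        elif path.suffix in DATABASE_SUFFIXES:
            report["databases"].append(rel)
        else:
            report["other"].append(rel)
    return report


if __name__ == "__main__":
    # Example: point this at a data directory such as ~/.screenpipe
    result = classify(Path.home() / ".screenpipe")
    for group, files in result.items():
        print(group, len(files))
```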
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Redo
Share and export
Copy
Show more options
Ask Gemini
Ask Gemini
Open upload file menu
Tools
Open mode picker
Pro
Microphone
Send message
Gemini is AI and can make mistakes, including about people.
Your privacy and Gemini Opens in a new window
Your privacy and Gemini
Opens in a new window
Summarize page
Summarize page
21.1
KB/s
14.6
KB/s
Files
Control Panel
Storage
App Center
Logs
Support
Task Manager
Music
Cloud Drives
Theater
Photos
Online Office
TextEdit
Virtual Machine
Downloads
DLNA
File Version Explorer
Security
Jellyfin-HT
SAN Manager
Vault
Snapshot
Comics
Sync & Backup
UGREEN AI
Recycle Bin
Control Panel
Search
Connection & Access
User Management
File Service
Device Connection
Domain/LDAP
Terminal
General
Hardware & Power
Time & Language
Network
Security
Indexing Service
Service
About
Update & Restore
Telnet
Enable
Enable
Port
23
Advanced settings...
|
30866
|
NULL
|
NULL
|
NULL
|
|
30860
|
1217
|
14
|
2026-05-13T08:05:47.070729+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-13/1778 /Users/lukas/.screenpipe/data/data/2026-05-13/1778659547070_m1.jpg...
|
Firefox
|
DXP4800PLUS-B5F8 — Personal
|
1
|
nas.lakylak.xyz/desktop/#/
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Close tab
New Tab
New Tab
Screenpipe — Archive
Screenpipe — Archive
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
Edit
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Redo
Share and export
Copy
Show more options
Ask Gemini
Ask Gemini
Open upload file menu
Tools
Open mode picker
Pro
Microphone
Send message
Gemini is AI and can make mistakes, including about people.
Your privacy and Gemini Opens in a new window
Your privacy and Gemini
Opens in a new window
Summarize page
Summarize page
11.5
KB/s
19
KB/s
Files
Control Panel
Storage
App Center
Logs
Support
Task Manager
Music
Cloud Drives
Theater
Photos
Online Office
TextEdit
Virtual Machine
Downloads
DLNA
File Version Explorer
Security
Jellyfin-HT
SAN Manager
Vault
Snapshot
Comics
Sync & Backup
UGREEN AI
Recycle Bin
Control Panel
Search
Connection & Access
User Management
File Service
Device Connection
Domain/LDAP...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"New Tab","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: 
archive.db","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: db.sqlite","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize 
sidebar","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New 
chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your 
hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. 
(Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". 
This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more 
options","depth":22,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXTextArea","text":"Ask Gemini","depth":20,"on_screen":true,"value":"Ask Gemini","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Ask Gemini","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file menu","depth":20,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Tools","depth":18,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open mode picker","depth":20,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Pro","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Microphone","depth":19,"on_screen":true,"role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Send message","depth":19,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":false,"is_focused":true,"is_selected":false},{"role":"AXStaticText","text":"Gemini is AI and can make mistakes, including about people.","depth":17,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Your privacy and Gemini Opens in a new 
window","depth":17,"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Your privacy and Gemini","depth":18,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Opens in a new window","depth":19,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Summarize page","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Summarize page","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"","depth":18,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"11.5","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"KB/s","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"19","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"KB/s","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Files","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Control Panel","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Storage","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"App 
Center","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Logs","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Support","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Task Manager","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Music","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Cloud Drives","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Theater","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Photos","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Online Office","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"TextEdit","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Virtual Machine","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Downloads","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"DLNA","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"File Version 
Explorer","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Security","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Jellyfin-HT","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SAN Manager","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Vault","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Snapshot","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Comics","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Sync & Backup","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"UGREEN AI","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Recycle Bin","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Control Panel","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"","depth":13,"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"","depth":14,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"","depth":19,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXTextField","text":"Search","depth":18,"on_screen":true,"help_text":"","role_description":"text 
field","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Connection & Access","depth":19,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"User Management","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"File Service","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Device Connection","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Domain/LDAP","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
2376196041007949805
|
9207945952361491351
|
visual_change
|
accessibility
|
NULL
|
Conversation with Gemini
You said: Explain to me: how does the ScreenPipe audio recording work? Where does it start? How is it getting transferred and transcribed? Is there some work-in-progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said:
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is designed to be a local, privacy-first application: the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system's audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is very short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. Users can also configure a cloud provider such as Deepgram if they need faster processing, but local Whisper is the default.
Diarization: As it transcribes the text, the engine also performs "diarization", the technical term for separating speech by speaker. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
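As a rough illustration of the kind of lookup FTS5 makes possible, the following Python sketch builds a small in-memory index and runs a phrase search. The table name `transcripts`, its columns, and the sample rows are hypothetical and are not ScreenPipe's actual schema; the sketch also assumes the local Python `sqlite3` build was compiled with FTS5.

```python
import sqlite3


def build_index(rows):
    """Create an in-memory FTS5 table and load (timestamp, speaker, text) rows.

    The schema is a hypothetical stand-in, not ScreenPipe's real one.
    """
    conn = sqlite3.connect(":memory:")
    # UNINDEXED columns are stored with each row but excluded from the text index.
    conn.execute(
        "CREATE VIRTUAL TABLE transcripts "
        "USING fts5(timestamp UNINDEXED, speaker UNINDEXED, text)"
    )
    conn.executemany("INSERT INTO transcripts VALUES (?, ?, ?)", rows)
    return conn


def search(conn, phrase):
    """Return rows whose text contains the exact phrase, best match first."""
    # Wrapping the input in double quotes makes FTS5 treat it as one phrase;
    # embedded double quotes are escaped by doubling them.
    query = '"' + phrase.replace('"', '""') + '"'
    cur = conn.execute(
        "SELECT timestamp, speaker, text FROM transcripts "
        "WHERE transcripts MATCH ? ORDER BY rank",
        (query,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    sample = [
        ("2026-04-22T10:00:00", "speaker_1", "We reviewed the quarterly budget today"),
        ("2026-04-22T10:01:00", "speaker_2", "The budget for next year is still open"),
    ]
    for row in search(build_index(sample), "quarterly budget"):
        print(row)
```

The phrase query matches only the first sample row, because the second row contains "budget" but not the full phrase.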
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
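The three steps above (buffering, queueing, finalization) follow a general producer/consumer pattern, which can be sketched in Python as below. This is a minimal model, not ScreenPipe's implementation: `fake_transcribe` is a hypothetical placeholder for the speech-to-text call, and the `committed` list stands in for the database write.

```python
import queue
import threading


def fake_transcribe(chunk: bytes) -> str:
    """Hypothetical stand-in for a speech-to-text call."""
    return f"<{len(chunk)} bytes transcribed>"


def run_pipeline(chunks):
    """Push audio chunks through a queue and return the committed results."""
    pending = queue.Queue()  # chunks waiting here are the "work in progress"
    committed = []           # stands in for rows written to the database

    def worker():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more chunks will arrive
                break
            index, chunk = item
            committed.append((index, fake_transcribe(chunk)))

    thread = threading.Thread(target=worker)
    thread.start()
    for index, chunk in enumerate(chunks):
        pending.put((index, chunk))  # the capture side enqueues each chunk
    pending.put(None)
    thread.join()
    return committed


if __name__ == "__main__":
    print(run_pipeline([b"ab", b"cde"]))
```

With a single worker the chunks are committed in arrival order; when the consumer is slower than the producer, the queue simply grows, which corresponds to the backlog described above.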
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database ([…]): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The […] or […] folders: This is where the compressed audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary […] chunks, or locked database journals (like […]), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
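One way to look for such journal files is sketched below in Python. The capture does not show which file name the answer gave as its example, so the suffixes here come from general SQLite behaviour rather than from the text above: SQLite writes `-journal` files in rollback mode and `-wal`/`-shm` files in write-ahead-log mode. The function name and the directory argument are hypothetical. Note that a `-wal` file can also exist whenever the database is simply open, so its presence is a hint of activity, not proof of a backlog.

```python
from pathlib import Path

# Suffixes SQLite appends to the database file name for its side files.
SIDE_SUFFIXES = ("-wal", "-shm", "-journal")


def sqlite_sidecars(directory):
    """Return the sorted names of SQLite side files directly inside `directory`."""
    root = Path(directory)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_file() and entry.name.endswith(SIDE_SUFFIXES)
    )


if __name__ == "__main__":
    import sys

    # Usage: python sidecars.py <path-to-data-directory>
    for name in sqlite_sidecars(sys.argv[1]):
        print(name)
```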
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
|
30859
|
NULL
|
NULL
|
NULL
|
|
30843
|
1218
|
6
|
2026-05-13T08:05:29.059359+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-13/1778 /Users/lukas/.screenpipe/data/data/2026-05-13/1778659529059_m2.jpg...
|
Firefox
|
DXP4800PLUS-B5F8 — Personal
|
1
|
nas.lakylak.xyz/desktop/#/
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.1245012,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"New Tab","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.014960106,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"bounds":{"left":0.0,"top":0.3463687,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.3575419,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New 
Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.38068634,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat 
settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share 
conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your 
hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"bounds":{"left":0.0787899,"top":0.0,"width":0.079953454,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.030418882,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"bounds":{"left":0.12184176,"top":0.0,"width":0.15558511,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe 
uses","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.21442819,"height":0.037110932},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"bounds":{"left":0.13297872,"top":0.0,"width":0.042054523,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.21675532,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"bounds":{"left":0.15674867,"top":0.021947326,"width":0.008643617,"height":0.015961692},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"bounds":{"left":0.09142287,"top":0.051077414,"width":0.029753989,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"bounds":{"left":0.09142287,"top":0.051077414,"width":0.21708776,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"bounds":{"left":0.09142287,"top":0.07182761,"width":0.21958111,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"bounds":{"left":0.203125,"top":0.09297685,"width":0.008643617,"height":0.015961692},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"bounds":{"left":0.09142287,"top":0.12210695,"width":0.021775266,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"bounds":{"left":0.11319814,"top":0.12210695,"width":0.12849069,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"bounds":{"left":0.24168883,"top":0.12210695,"width":0.043218084,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"bounds":{"left":0.09142287,"top":0.12210695,"width":0.21392952,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"bounds":{"left":0.0787899,"top":0.20430966,"width":0.234375,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"bounds":{"left":0.0787899,"top":0.20590582,"width":0.09690824,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"bounds":{"left":0.0787899,"top":0.23224261,"width":0.0887633,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"bounds":{"left":0.16755319,"top":0.23224261,"width":0.018284574,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"bounds":{"left":0.18583776,"top":0.23224261,"width":0.036070477,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"bounds":{"left":0.22190824,"top":0.23224261,"width":0.015458777,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"bounds":{"left":0.0787899,"top":0.23224261,"width":0.234375,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"bounds":{"left":0.09142287,"top":0.28252193,"width":0.025764627,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"bounds":{"left":0.09142287,"top":0.28252193,"width":0.2122673,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing 
Queue:","depth":29,"bounds":{"left":0.09142287,"top":0.33280128,"width":0.04837101,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"bounds":{"left":0.09142287,"top":0.33280128,"width":0.22057846,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"bounds":{"left":0.09142287,"top":0.3830806,"width":0.031416222,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"bounds":{"left":0.09142287,"top":0.3830806,"width":0.21991356,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"bounds":{"left":0.0787899,"top":0.46528333,"width":0.234375,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. 
Understanding the Folder Structure","depth":27,"bounds":{"left":0.0787899,"top":0.4668795,"width":0.09823803,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"bounds":{"left":0.0787899,"top":0.49321628,"width":0.20910904,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"bounds":{"left":0.080784574,"top":0.5151636,"width":0.036236703,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"bounds":{"left":0.0787899,"top":0.5139665,"width":0.22988696,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"bounds":{"left":0.09142287,"top":0.5642458,"width":0.05817819,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"bounds":{"left":0.17885639,"top":0.5642458,"width":0.0034906915,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"bounds":{"left":0.09142287,"top":0.5642458,"width":0.21010639,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"bounds":{"left":0.09142287,"top":0.61452514,"width":0.011303191,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"bounds":{"left":0.120678194,"top":0.61452514,"width":0.008144947,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"bounds":{"left":0.14960106,"top":0.61452514,"width":0.020944148,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. 
Think of this as the raw archive.","depth":29,"bounds":{"left":0.09142287,"top":0.61452514,"width":0.21542554,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"bounds":{"left":0.09142287,"top":0.66480446,"width":0.028922873,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"bounds":{"left":0.12034574,"top":0.66480446,"width":0.106715426,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"bounds":{"left":0.09142287,"top":0.66480446,"width":0.21991356,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"bounds":{"left":0.09142287,"top":0.6855547,"width":0.22174202,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"bounds":{"left":0.0787899,"top":0.76775736,"width":0.22456782,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"bounds":{"left":0.0787899,"top":0.82202715,"width":0.036402926,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"bounds":{"left":0.091755316,"top":0.8308061,"width":0.017785905,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"bounds":{"left":0.075465426,"top":0.8667199,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"bounds":{"left":0.08610372,"top":0.8667199,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle 
button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"bounds":{"left":0.09674202,"top":0.8667199,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"bounds":{"left":0.107380316,"top":0.8667199,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"bounds":{"left":0.11801862,"top":0.8667199,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":22,"bounds":{"left":0.12865691,"top":0.8667199,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXTextArea","text":"Ask Gemini","depth":20,"bounds":{"left":0.08211436,"top":0.83439744,"width":0.22573139,"height":0.01915403},"on_screen":true,"value":"Ask Gemini","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Ask Gemini","depth":21,"bounds":{"left":0.08211436,"top":0.8347965,"width":0.030086435,"height":0.018355945},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file 
menu","depth":20,"bounds":{"left":0.078125,"top":0.87031126,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Tools","depth":18,"bounds":{"left":0.094082445,"top":0.87031126,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open mode picker","depth":20,"bounds":{"left":0.27044547,"top":0.867917,"width":0.026097074,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Pro","depth":23,"bounds":{"left":0.2757646,"top":0.87669593,"width":0.007480053,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Microphone","depth":19,"bounds":{"left":0.29853722,"top":0.867917,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Send message","depth":19,"bounds":{"left":0.30485374,"top":0.8671189,"width":0.013962766,"height":0.033519555},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":false,"is_focused":true,"is_selected":false},{"role":"AXStaticText","text":"Gemini is AI and can make mistakes, including about people.","depth":17,"bounds":{"left":0.11702128,"top":0.92178774,"width":0.11170213,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Your privacy and Gemini Opens in a new 
window","depth":17,"bounds":{"left":0.2287234,"top":0.92178774,"width":0.044215426,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Your privacy and Gemini","depth":18,"bounds":{"left":0.2287234,"top":0.92178774,"width":0.044215426,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Opens in a new window","depth":19,"bounds":{"left":0.068484046,"top":0.92098963,"width":0.043218084,"height":0.012370312},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Summarize page","depth":7,"bounds":{"left":0.07413564,"top":0.95730245,"width":0.053523935,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Summarize page","depth":9,"bounds":{"left":0.07978723,"top":0.96249,"width":0.042220745,"height":0.015163607},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"","depth":18,"bounds":{"left":0.9772274,"top":0.06304868,"width":0.0066489363,"height":0.015961692},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-4753466998424860570
|
9207945950218726295
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Close tab
New Tab
New Tab
Screenpipe — Archive
Screenpipe — Archive
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
Edit
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
...
|
30841
|
NULL
|
NULL
|
NULL
|
|
30859
|
1217
|
13
|
2026-05-13T08:05:46.317608+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-13/1778 /Users/lukas/.screenpipe/data/data/2026-05-13/1778659546317_m1.jpg...
|
Firefox
|
DXP4800PLUS-B5F8 — Personal
|
1
|
nas.lakylak.xyz/desktop/#/
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a 100% local, privacy-first application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio: Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input: Everything you say into your mic.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
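The chunking step can be pictured with a small sketch. The function name, the sample rate, and the chunk length below are illustrative assumptions, since the text does not state how long ScreenPipe's chunks actually are.

```python
def split_into_chunks(samples, sample_rate, chunk_seconds):
    """Split a flat sequence of audio samples into fixed-length chunks.

    The last chunk may be shorter if the stream does not divide evenly.
    """
    size = int(sample_rate * chunk_seconds)
    if size <= 0:
        raise ValueError("sample_rate * chunk_seconds must be at least 1")
    return [samples[i:i + size] for i in range(0, len(samples), size)]


# Toy numbers: 10 samples at 2 samples per second, cut into 2-second chunks.
chunks = split_into_chunks(list(range(10)), sample_rate=2, chunk_seconds=2)
print(chunks)
```

Each chunk can then be handed to the transcription stage on its own, so a long recording never has to be processed in one piece.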
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine: The raw audio chunks are fed into a local speech-to-text model. By default, ScreenPipe uses OpenAI Whisper running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard.)
Diarization: As it transcribes the text, the engine also performs "diarization", a technical term for working out which speaker said which part of the audio. It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
Storage: The final transcribed text is then indexed into a local SQLite database. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between hearing the audio and saving the text, which acts as the WIP stage:
Buffering: Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue: Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization: Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like ~/.screenpipe on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database ( ): This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The  or  folders: This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files: If you see rapidly changing files, temporary  chunks, or locked database journals (like ), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
8.8 KB/s
9.3 KB/s
Files
Control Panel
Storage
App Center
Logs
Support
Task Manager
Music
Cloud Drives
Theater
Photos
Online Office
TextEdit
Virtual Machine
Downloads
DLNA
File Version Explorer
Security
Jellyfin-HT
SAN Manager
Vault
Snapshot
Comics
Sync & Backup
UGREEN AI
Recycle Bin
Control Panel...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"New Tab","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: 
archive.db","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: db.sqlite","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize 
sidebar","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New 
chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your 
hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. 
(Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". 
This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more 
options","depth":22,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXTextArea","text":"Ask Gemini","depth":20,"on_screen":true,"value":"Ask Gemini","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Ask Gemini","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file menu","depth":20,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Tools","depth":18,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open mode picker","depth":20,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Pro","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Microphone","depth":19,"on_screen":true,"role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Send message","depth":19,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":false,"is_focused":true,"is_selected":false},{"role":"AXStaticText","text":"Gemini is AI and can make mistakes, including about people.","depth":17,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Your privacy and Gemini Opens in a new 
window","depth":17,"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Your privacy and Gemini","depth":18,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Opens in a new window","depth":19,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Summarize page","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Summarize page","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"","depth":18,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"8.8","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"KB/s","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"9.3","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"KB/s","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Files","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Control Panel","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Storage","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"App 
Center","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Logs","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Support","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Task Manager","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Music","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Cloud Drives","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Theater","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Photos","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Online Office","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"TextEdit","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Virtual Machine","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Downloads","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"DLNA","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"File Version 
Explorer","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Security","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Jellyfin-HT","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SAN Manager","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Vault","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Snapshot","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Comics","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Sync & Backup","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"UGREEN AI","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Recycle Bin","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Control Panel","depth":13,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
-2888843821405040075
|
9207945950218202007
|
app_switch
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Close tab
New Tab
New Tab
Screenpipe — Archive
Screenpipe — Archive
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
Edit
You said Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application: by default, its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization": splitting the audio into segments by speaker, i.e. working out who spoke when.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
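A minimal sketch of the full-text search idea described above, using Python's built-in sqlite3 module. The table name, column names, and sample rows are hypothetical and chosen for illustration; they are not ScreenPipe's actual schema. It also assumes the local SQLite build has FTS5 enabled, which is common but not guaranteed.

```python
import sqlite3

# Hypothetical schema for illustration only; not ScreenPipe's real tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE transcripts "
    "USING fts5(speaker, text, recorded_at UNINDEXED)"
)
conn.executemany(
    "INSERT INTO transcripts (speaker, text, recorded_at) VALUES (?, ?, ?)",
    [
        ("speaker_0", "let's move the standup to ten", "2026-05-13T09:00:00"),
        ("speaker_1", "the quarterly budget is due on friday", "2026-05-13T09:01:00"),
    ],
)


def search(conn, phrase):
    """Return rows whose indexed text contains the exact phrase."""
    # Double quotes turn the input into an FTS5 phrase query;
    # embedded quotes are escaped by doubling them.
    quoted = '"' + phrase.replace('"', '""') + '"'
    return conn.execute(
        "SELECT speaker, text, recorded_at FROM transcripts "
        "WHERE transcripts MATCH ? ORDER BY rank",
        (quoted,),
    ).fetchall()


hits = search(conn, "quarterly budget")
```

Because the index is built when rows are inserted, a query like this stays fast even as weeks of transcripts accumulate.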
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
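The buffer, queue, and commit steps above can be sketched as a small producer/consumer loop. This is an illustration under stated assumptions, not ScreenPipe's implementation: fake_transcribe is a placeholder for a real speech-to-text call, and the transcriptions table is a hypothetical schema.

```python
import queue
import sqlite3
import threading


def fake_transcribe(chunk: bytes) -> str:
    """Placeholder for a speech-to-text call (hypothetical)."""
    return f"<{len(chunk)} bytes of audio>"


def run_pipeline(chunks):
    """Queue audio chunks, transcribe them in order, commit each result."""
    # Hypothetical table; check_same_thread=False lets the worker thread
    # use a connection that was opened on the main thread.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute(
        "CREATE TABLE transcriptions ("
        "chunk_id INTEGER PRIMARY KEY, text TEXT, "
        "created_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    pending = queue.Queue()  # chunks waiting here are the "work in progress"

    def worker():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more chunks will arrive
                break
            chunk_id, chunk = item
            text = fake_transcribe(chunk)
            conn.execute(
                "INSERT INTO transcriptions (chunk_id, text) VALUES (?, ?)",
                (chunk_id, text),
            )
            conn.commit()  # a chunk counts as done only after this commit

    thread = threading.Thread(target=worker)
    thread.start()
    for chunk_id, chunk in enumerate(chunks):
        pending.put((chunk_id, chunk))
    pending.put(None)
    thread.join()
    return conn.execute(
        "SELECT chunk_id, text FROM transcriptions ORDER BY chunk_id"
    ).fetchall()


rows = run_pipeline([b"\x00" * 4, b"\x00" * 8])
```

In this sketch, anything still sitting in the queue has been recorded but not yet transcribed, which is the backlog the section calls the WIP stage.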
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Redo
Share and export
Copy
Show more options
Ask Gemini
Ask Gemini
Open upload file menu
Tools
Open mode picker
Pro
Microphone
Send message
Gemini is AI and can make mistakes, including about people.
Your privacy and Gemini Opens in a new window
Your privacy and Gemini
Opens in a new window
Summarize page
Summarize page
8.8
KB/s
9.3
KB/s
Files
Control Panel
Storage
App Center
Logs
Support
Task Manager
Music
Cloud Drives
Theater
Photos
Online Office
TextEdit
Virtual Machine
Downloads
DLNA
File Version Explorer
Security
Jellyfin-HT
SAN Manager
Vault
Snapshot
Comics
Sync & Backup
UGREEN AI
Recycle Bin
Control Panel...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
38610
|
1434
|
15
|
2026-05-13T17:51:15.466947+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-13/1778 /Users/lukas/.screenpipe/data/data/2026-05-13/1778694675466_m2.jpg...
|
Firefox
|
Manage extra usage for paid Claude plans | Claude Manage extra usage for paid Claude plans | Claude Help Center — Personal...
|
1
|
claude.ai/settings/usage
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
New Tab
New Tab
Screenpipe — Archive
Screenpipe — Archive
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Close tab
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
Close tab
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Explain to me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application: by default, its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization": splitting the audio into segments by speaker, i.e. working out who spoke when.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"bounds":{"left":0.0,"top":0.08459697,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"bounds":{"left":0.013297873,"top":0.09577015,"width":0.029587766,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"bounds":{"left":0.0,"top":0.11731844,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"bounds":{"left":0.013297873,"top":0.12849163,"width":0.036901597,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"New Tab","depth":4,"bounds":{"left":0.0,"top":0.15003991,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"New 
Tab","depth":5,"bounds":{"left":0.013297873,"top":0.16121309,"width":0.014960106,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"bounds":{"left":0.0,"top":0.18276137,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"bounds":{"left":0.013297873,"top":0.19393456,"width":0.037898935,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: archive.db","depth":4,"bounds":{"left":0.0,"top":0.21548285,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"bounds":{"left":0.013297873,"top":0.22665602,"width":0.040724736,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.2482043,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: 
db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.25937748,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"bounds":{"left":0.0,"top":0.28092578,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"bounds":{"left":0.013297873,"top":0.29209897,"width":0.012134309,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.28810853,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"bounds":{"left":0.0,"top":0.31364724,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"bounds":{"left":0.013297873,"top":0.32482043,"width":0.1100399,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.32083002,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google 
Search","depth":4,"bounds":{"left":0.0,"top":0.3463687,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"bounds":{"left":0.013297873,"top":0.3575419,"width":0.05668218,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.38068634,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle 
button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"bounds":{"left":0.29321808,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"bounds":{"left":0.30518618,"top":0.055067837,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"bounds":{"left":0.3025266,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"bounds":{"left":0.07280585,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New 
chat","depth":12,"bounds":{"left":0.08610372,"top":0.10454908,"width":0.028590426,"height":0.030327214},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"bounds":{"left":0.0887633,"top":0.10973663,"width":0.021941489,"height":0.020351157},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"bounds":{"left":0.2613032,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"bounds":{"left":0.27460107,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"bounds":{"left":0.28789893,"top":0.103751,"width":0.013297873,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"bounds":{"left":0.068484046,"top":0.14764565,"width":0.0003324468,"height":0.0007980846},"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"bounds":{"left":0.068484046,"top":0.15003991,"width":0.1200133,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy 
prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be 
a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. 
ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. 
The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. 
The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). 
You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.05817819,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"bounds":{"left":0.17885639,"top":0.0,"width":0.0034906915,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.21010639,"height":0.037110932},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.011303191,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"bounds":{"left":0.120678194,"top":0.0,"width":0.008144947,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"bounds":{"left":0.14960106,"top":0.0,"width":0.020944148,"height":0.016360734},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. 
Think of this as the raw archive.","depth":29,"bounds":{"left":0.09142287,"top":0.0,"width":0.21542554,"height":0.037110932},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"bounds":{"left":0.09142287,"top":0.029928172,"width":0.028922873,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"bounds":{"left":0.12034574,"top":0.029928172,"width":0.106715426,"height":0.016360734},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"bounds":{"left":0.09142287,"top":0.029928172,"width":0.21991356,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"bounds":{"left":0.09142287,"top":0.050678372,"width":0.22174202,"height":0.057861134},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"bounds":{"left":0.0787899,"top":0.13288109,"width":0.22456782,"height":0.037110932},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"bounds":{"left":0.0787899,"top":0.18715084,"width":0.036402926,"height":0.031923383},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"bounds":{"left":0.091755316,"top":0.19592977,"width":0.017785905,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":24,"bounds":{"left":0.075465426,"top":0.23184358,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":24,"bounds":{"left":0.08610372,"top":0.23184358,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share and 
export","depth":23,"bounds":{"left":0.09674202,"top":0.23184358,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":24,"bounds":{"left":0.107380316,"top":0.23184358,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more options","depth":23,"bounds":{"left":0.11801862,"top":0.23184358,"width":0.010638298,"height":0.025538707},"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false}]...
|
-8687595196792983741
|
9207943753342581383
|
visual_change
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
New Tab
New Tab
Screenpipe — Archive
Screenpipe — Archive
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Close tab
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
Close tab
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Share and export
Copy
Show more options...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
30842
|
1217
|
6
|
2026-05-13T08:05:29.047813+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-13/1778 /Users/lukas/.screenpipe/data/data/2026-05-13/1778659529047_m1.jpg...
|
Firefox
|
DXP4800PLUS-B5F8 — Personal
|
1
|
nas.lakylak.xyz/desktop/#/
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Close tab
New Tab
New Tab
Screenpipe — Archive
Screenpipe — Archive
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
Edit
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Redo
Share and export
Copy
Show more options
Ask Gemini
Ask Gemini
Open upload file menu
Tools
Open mode picker
Pro
Microphone
Send message
Gemini is AI and can make mistakes, including about people.
Your privacy and Gemini Opens in a new window
Your privacy and Gemini
Opens in a new window
Summarize page
Summarize page...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"All docs · AFFiNE","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"All docs · AFFiNE","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"DXP4800PLUS-B5F8","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"DXP4800PLUS-B5F8","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXRadioButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"New Tab","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Screenpipe — Archive","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Screenpipe — Archive","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: 
archive.db","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: archive.db","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"SQLite Web: db.sqlite","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Claude","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Claude","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Manage extra usage for paid Claude plans | Claude Help Center","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXRadioButton","text":"2 TB in 25 MB/s - Google Search","depth":4,"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"2 TB in 25 MB/s - Google Search","depth":5,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New Tab","depth":4,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize 
sidebar","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Close Google Gemini (⌃X)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"AI Chat settings","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Close","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Google Account: Lukáš Koválik (kovaliklukas@gmail.com)","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Main menu","depth":12,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New 
chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"New chat","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Share conversation","depth":11,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Open menu for conversation actions.","depth":12,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"Conversation with Gemini","depth":15,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Conversation with Gemini","depth":16,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Copy prompt","depth":21,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Edit","depth":21,"on_screen":false,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? 
How can I see from the folder structure what has been done and what hasn't been done?","depth":21,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"You said","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?","depth":23,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Listen","depth":24,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Show thinking","depth":28,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"Gemini said","depth":23,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Gemini said","depth":24,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"ScreenPipe is fundamentally designed to be a","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"100% local, privacy-first","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"application, meaning the vast majority of its processing happens directly on your 
hardware without relying on the cloud.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":27,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXHeading","text":"1. Where It Starts: The Capture Layer","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1. Where It Starts: The Capture Layer","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"System Audio:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Microphone Input:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Everything you say into your mic.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. 
Opens side panel.","depth":29,"on_screen":false,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"2. How It Gets Transferred and Transcribed","depth":26,"on_screen":false,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2. How It Gets Transferred and Transcribed","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Because ScreenPipe prioritizes local processing, the \"transfer\" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Here is the transcription pipeline:","depth":27,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The Engine:","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The raw audio chunks are fed into a local speech-to-text model.","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"By default, ScreenPipe uses","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"OpenAI Whisper","depth":29,"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"running locally on your hardware. 
(Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Diarization:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"As it transcribes the text, the engine also performs \"diarization\"—a technical term for speaker identification.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"View source details. Opens side panel.","depth":29,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Storage:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The final transcribed text is then indexed into a local","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"SQLite database","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":". 
This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"3. The \"Work in Progress\" (WIP) Stage","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3. The \"Work in Progress\" (WIP) Stage","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"There is a slight bottleneck between","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"hearing","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the audio and","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"saving","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"the text, which acts as the WIP stage:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Buffering:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Processing Queue:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Whisper requires CPU/GPU resources to transcribe. 
If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Finalization:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXHeading","text":"4. Understanding the Folder Structure","depth":26,"on_screen":true,"help_text":"","role_description":"heading","subrole":"AXUnknown"},{"role":"AXStaticText","text":"4. Understanding the Folder Structure","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"~/.screenpipe","depth":28,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The SQLite Database (","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"):","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is the master ledger. 
If text exists inside this database, it means the audio has been fully transcribed, diarized, and is \"done.\"","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"The","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"or","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"folders:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Temp Files:","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"If you see rapidly changing files, temporary","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"chunks, or locked database journals (like","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"), this indicates the system is actively processing a backlog of audio (the WIP stage). 
Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.","depth":29,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?","depth":27,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Sources","depth":26,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Sources","depth":28,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Good response","depth":23,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bad response","depth":23,"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Redo","depth":23,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Share and export","depth":22,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Copy","depth":23,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Show more 
options","depth":22,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXTextArea","text":"Ask Gemini","depth":20,"on_screen":true,"value":"Ask Gemini","help_text":"","role_description":"text entry area","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Ask Gemini","depth":21,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Open upload file menu","depth":20,"on_screen":true,"role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Tools","depth":18,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Open mode picker","depth":20,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Pro","depth":23,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXCheckBox","text":"Microphone","depth":19,"on_screen":true,"role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"Send message","depth":19,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":false,"is_focused":true,"is_selected":false},{"role":"AXStaticText","text":"Gemini is AI and can make mistakes, including about people.","depth":17,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Your privacy and Gemini Opens in a new 
window","depth":17,"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Your privacy and Gemini","depth":18,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Opens in a new window","depth":19,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Summarize page","depth":7,"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Summarize page","depth":9,"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
5524643175593705011
|
9207910765846637463
|
click
|
accessibility
|
NULL
|
Screenpipe — Archive
Screenpipe — Archive
All docs Screenpipe — Archive
Screenpipe — Archive
All docs · AFFiNE
All docs · AFFiNE
DXP4800PLUS-B5F8
DXP4800PLUS-B5F8
Close tab
New Tab
New Tab
Screenpipe — Archive
Screenpipe — Archive
SQLite Web: archive.db
SQLite Web: archive.db
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Claude
Claude
Manage extra usage for paid Claude plans | Claude Help Center
Manage extra usage for paid Claude plans | Claude Help Center
2 TB in 25 MB/s - Google Search
2 TB in 25 MB/s - Google Search
New Tab
Customize sidebar
Close Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
AI Chat settings
Close
Google Account: Lukáš Koválik ([EMAIL])
Main menu
New chat
Gemini
New chat
Share conversation
Open menu for conversation actions.
Conversation with Gemini
Conversation with Gemini
Copy prompt
Edit
You said Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
You said
Exaplin me: how does the ScreenPipe audio recording work? Where did it start? How is it getting transferred and transcribed? Is there some work in progress stage? How can I see from the folder structure what has been done and what hasn't been done?
Listen
Show thinking
Gemini said
Gemini said
Here is a breakdown of how ScreenPipe handles audio recording, from the moment it captures sound to how it organizes the data on your machine.
ScreenPipe is fundamentally designed to be a
100% local, privacy-first
application, meaning the vast majority of its processing happens directly on your hardware without relying on the cloud.
View source details. Opens side panel.
1. Where It Starts: The Capture Layer
1. Where It Starts: The Capture Layer
The process begins locally by tapping directly into your operating system’s audio routing. ScreenPipe simultaneously captures two separate audio streams:
System Audio:
Everything your computer outputs (e.g., the voices of other people in a Zoom or Teams meeting, a YouTube video, or Spotify).
Microphone Input:
Everything you say into your mic.
View source details. Opens side panel.
Because ScreenPipe acts as a 24/7 background process, it continuously listens to these streams and breaks the continuous audio into manageable chunks to prepare them for the next stage.
2. How It Gets Transferred and Transcribed
2. How It Gets Transferred and Transcribed
Because ScreenPipe prioritizes local processing, the "transfer" step is incredibly short. The audio data never leaves your machine unless you explicitly set up a cloud plugin.
Here is the transcription pipeline:
The Engine:
The raw audio chunks are fed into a local speech-to-text model.
By default, ScreenPipe uses
OpenAI Whisper
running locally on your hardware. (Users can also configure it to use cloud providers like Deepgram if they need faster processing, but local Whisper is the standard).
View source details. Opens side panel.
Diarization:
As it transcribes the text, the engine also performs "diarization"—a technical term for speaker identification.
It analyzes the audio to distinguish between your voice and the voices of others, labeling who said what.
View source details. Opens side panel.
Storage:
The final transcribed text is then indexed into a local
SQLite database
. This database uses Full-Text Search (FTS5), which is what allows you to instantly search for a phrase you heard in a meeting three weeks ago.
3. The "Work in Progress" (WIP) Stage
3. The "Work in Progress" (WIP) Stage
There is a slight bottleneck between
hearing
the audio and
saving
the text, which acts as the WIP stage:
Buffering:
Audio is recorded into a temporary buffer in your system's RAM or written as temporary chunk files on your disk.
Processing Queue:
Whisper requires CPU/GPU resources to transcribe. If you are having a rapid, multi-person conversation, the raw audio chunks line up in a processing queue.
Finalization:
Once the AI finishes transcribing a chunk, the text is committed to the SQLite database alongside a timestamp, and the raw audio is compressed and saved so you can play it back later.
4. Understanding the Folder Structure
4. Understanding the Folder Structure
All of ScreenPipe's data lives locally on your machine (usually in a hidden directory like
~/.screenpipe
on Mac/Linux or in your AppData folder on Windows). You can determine what has been processed by looking at a few key areas in this directory:
The SQLite Database (
):
This is the master ledger. If text exists inside this database, it means the audio has been fully transcribed, diarized, and is "done."
The
or
folders:
This is where the compressed, raw audio files and JPEG screenshots are stored permanently. Think of this as the raw archive.
Temp Files:
If you see rapidly changing files, temporary
chunks, or locked database journals (like
), this indicates the system is actively processing a backlog of audio (the WIP stage). Once the engine catches up, these temporary chunk files are usually cleared out or moved to permanent storage, and the database updates.
Are you setting up ScreenPipe for a specific workflow (like summarizing meetings or tracking your daily tasks), or just exploring how the architecture works under the hood?
Sources
Sources
Good response
Bad response
Redo
Share and export
Copy
Show more options
Ask Gemini
Ask Gemini
Open upload file menu
Tools
Open mode picker
Pro
Microphone
Send message
Gemini is AI and can make mistakes, including about people.
Your privacy and Gemini Opens in a new window
Your privacy and Gemini
Opens in a new window
Summarize page
Summarize page...
|
30840
|
NULL
|
NULL
|
NULL
|
|
29415
|
1181
|
34
|
2026-05-13T06:35:08.141913+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-13/1778 /Users/lukas/.screenpipe/data/data/2026-05-13/1778654108141_m2.jpg...
|
Firefox
|
SQLite Web: db.sqlite — Personal
|
1
|
sqlite.screenpipe.lakylak.xyz/audio_transcriptions sqlite.screenpipe.lakylak.xyz/audio_transcriptions/content/?page=2...
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Close SQLite Web: db.sqlite
SQLite Web: db.sqlite
Close tab
New Tab
Customize sidebar
Open Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
sqlite-web 0.7.2
sqlite-web 0.7.2
db.sqlite
db.sqlite
audio_transcriptions
472 rows, showing page 2
Query
Query
table name...
_sqlx_migrations
_sqlx_migrations
audio_chunks
audio_chunks
audio_tags
audio_tags
audio_transcriptions
audio_transcriptions
audio_transcriptions_fts (v)
audio_transcriptions_fts
(v)
audio_transcriptions_...
audio_transcriptions_...
audio_transcriptions_...
audio_transcriptions_...
audio_transcriptions_...
audio_transcriptions_...
elements
elements
elements_fts (v)
elements_fts
(v)
elements_fts_config
elements_fts_config
elements_fts_data
elements_fts_data
elements_fts_idx
elements_fts_idx
frames
frames
frames_fts (v)
frames_fts
(v)
frames_fts_config
frames_fts_config
frames_fts_data
frames_fts_data
frames_fts_idx
frames_fts_idx
meetings
meetings
memories
memories
memories_fts (v)
memories_fts
(v)
memories_fts_config
memories_fts_config
memories_fts_data
memories_fts_data
memories_fts_idx
memories_fts_idx
ocr_text
ocr_text
pipe_executions
pipe_executions
pipe_scheduler_state
pipe_scheduler_state
secrets
secrets
speaker_embeddings
speaker_embeddings
speakers
speakers
sqlite_sequence
sqlite_sequence
sqlite_stat1
sqlite_stat1
sqlite_stat4
sqlite_stat4
tags
tags
ui_events
ui_events
ui_events_fts (v)
ui_events_fts
(v)
ui_events_fts_config
ui_events_fts_config
ui_events_fts_data
ui_events_fts_data
ui_events_fts_idx
ui_events_fts_idx
video_chunks
video_chunks
vision_tags
vision_tags
Toggle helper tables
Toggle helper tables
Structure
Structure
Content
Content
Query
Query
Export
Export
id
id
audio_chunk_id
audio_chunk_id
offset_index
offset_index
timestamp
timestamp
transcription
transcription
device
device
is_input_device
is_input_device
speaker_id
speaker_id
transcription_engine
transcription_engine
start_time
start_time
end_time
end_time
text_length
text_length
sync_id
sync_id
synced_at
synced_at
redacted_at
redacted_at
51
131
0
2026-05-10T18:35:08+00:00
Does it provide any card?
soundcore AeroClip
True
1
WhisperTiny
28.6313125
29.8631875
26
NULL
NULL
NULL
52
134
0
2026-05-10T18:35:38+00:00
Thank you for watching.
soundcore AeroClip
True
2
WhisperTiny
27.4838125
29.0363125
24
NULL
NULL
NULL
53
136
0
2026-05-10T18:36:08+00:00
Don't ask me, I don't know what you'r
...
...
soundcore AeroClip
True
1
WhisperTiny
3.3019375
26.1844375
129
NULL
NULL
NULL
54
138
0
2026-05-10T18:36:38+00:00
That's not a good answer. That's a good a
...
...
soundcore AeroClip
True
2
WhisperTiny
16.4644375
23.7206875
70
NULL
NULL
NULL
55
138
0
2026-05-10T18:36:38+00:00
Now, the first one is the first one.
soundcore AeroClip
True
1
WhisperTiny
26.6231875
29.4244375
37
NULL
NULL
NULL
56
141
0
2026-05-10T18:37:08+00:00
it can't be by each.
soundcore AeroClip
True
1
WhisperTiny
0.7706875
15.1144375
21
NULL
NULL
NULL
57
143...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"SQLite Web: db.sqlite","depth":4,"bounds":{"left":0.0,"top":0.0518755,"width":0.06881649,"height":0.032721467},"on_screen":true,"help_text":"","role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true},{"role":"AXStaticText","text":"SQLite Web: db.sqlite","depth":5,"bounds":{"left":0.013297873,"top":0.06304868,"width":0.03756649,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXButton","text":"Close tab","depth":5,"bounds":{"left":0.05651596,"top":0.05905826,"width":0.007978723,"height":0.01915403},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXButton","text":"New Tab","depth":4,"bounds":{"left":0.0028257978,"top":0.08619314,"width":0.06333112,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"button","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Customize sidebar","depth":6,"bounds":{"left":0.0028257978,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open Google Gemini (⌃X)","depth":6,"bounds":{"left":0.013796543,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open history (⇧⌘H)","depth":6,"bounds":{"left":0.024933511,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle 
button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Open bookmarks (⌘B)","depth":6,"bounds":{"left":0.036070477,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXCheckBox","text":"Bitwarden","depth":6,"bounds":{"left":0.04720745,"top":0.97007185,"width":0.010638298,"height":0.025538707},"on_screen":true,"help_text":"","role_description":"toggle button","subrole":"AXToggle","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXLink","text":"sqlite-web 0.7.2","depth":7,"bounds":{"left":0.07413564,"top":0.0,"width":0.043218084,"height":0.030726258},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"sqlite-web 0.7.2","depth":8,"bounds":{"left":0.07413564,"top":0.0,"width":0.043218084,"height":0.017956903},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"db.sqlite","depth":10,"bounds":{"left":0.12267287,"top":0.0,"width":0.023936171,"height":0.029928172},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"db.sqlite","depth":11,"bounds":{"left":0.12533244,"top":0.0,"width":0.01861702,"height":0.01396648},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"audio_transcriptions","depth":10,"bounds":{"left":0.14660904,"top":0.0,"width":0.04454787,"height":0.01396648},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"472 rows, showing page 
2","depth":9,"bounds":{"left":0.1924867,"top":0.0,"width":0.05119681,"height":0.012370312},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Query","depth":8,"bounds":{"left":0.47639626,"top":0.0,"width":0.018284574,"height":0.023543496},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Query","depth":9,"bounds":{"left":0.4793883,"top":0.0,"width":0.012300532,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXTextField","text":"table name...","depth":7,"bounds":{"left":0.073803194,"top":0.0,"width":0.061835106,"height":0.023942538},"on_screen":false,"help_text":"","role_description":"text field","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXLink","text":"_sqlx_migrations","depth":9,"bounds":{"left":0.073803194,"top":0.0,"width":0.06200133,"height":0.022745412},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"_sqlx_migrations","depth":10,"bounds":{"left":0.07712766,"top":0.0,"width":0.038896278,"height":0.014764565},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"audio_chunks","depth":9,"bounds":{"left":0.073803194,"top":0.0023942539,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"audio_chunks","depth":10,"bounds":{"left":0.07712766,"top":0.006384677,"width":0.031416222,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"audio_tags","depth":9,"bounds":{"left":0.073803194,"top"
:0.025139665,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"audio_tags","depth":10,"bounds":{"left":0.07712766,"top":0.029130088,"width":0.025099734,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"audio_transcriptions","depth":9,"bounds":{"left":0.073803194,"top":0.047885075,"width":0.06200133,"height":0.02434158},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"audio_transcriptions","depth":10,"bounds":{"left":0.0774601,"top":0.052673582,"width":0.04654255,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"audio_transcriptions_fts (v)","depth":9,"bounds":{"left":0.073803194,"top":0.07222666,"width":0.06200133,"height":0.040702313},"on_screen":true,"help_text":"audio_transcriptions_fts","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"audio_transcriptions_fts","depth":10,"bounds":{"left":0.07712766,"top":0.07621708,"width":0.055352394,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(v)","depth":11,"bounds":{"left":0.07712766,"top":0.092577815,"width":0.004986702,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"audio_transcriptions_...","depth":9,"bounds":{"left":0.073803194,"top":0.11292897,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"audio_transcriptions_fts_config","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_select
ed":false},{"role":"AXStaticText","text":"audio_transcriptions_...","depth":10,"bounds":{"left":0.07712766,"top":0.11691939,"width":0.053523935,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"audio_transcriptions_...","depth":9,"bounds":{"left":0.073803194,"top":0.13567439,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"audio_transcriptions_fts_data","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"audio_transcriptions_...","depth":10,"bounds":{"left":0.07712766,"top":0.1396648,"width":0.053523935,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"audio_transcriptions_...","depth":9,"bounds":{"left":0.073803194,"top":0.15841979,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"audio_transcriptions_fts_idx","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"audio_transcriptions_...","depth":10,"bounds":{"left":0.07712766,"top":0.16241021,"width":0.053523935,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"elements","depth":9,"bounds":{"left":0.073803194,"top":0.1811652,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"elements","depth":10,"bounds":{"left":0.07712766,"top":0.18515563,"width":0.020777926,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"elements_fts 
(v)","depth":9,"bounds":{"left":0.073803194,"top":0.20391062,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"elements_fts","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"elements_fts","depth":10,"bounds":{"left":0.07712766,"top":0.20790103,"width":0.030917553,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(v)","depth":11,"bounds":{"left":0.10804521,"top":0.20630486,"width":0.004986702,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"elements_fts_config","depth":9,"bounds":{"left":0.073803194,"top":0.22665602,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"elements_fts_config","depth":10,"bounds":{"left":0.07712766,"top":0.23064645,"width":0.04637633,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"elements_fts_data","depth":9,"bounds":{"left":0.073803194,"top":0.24940144,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"elements_fts_data","depth":10,"bounds":{"left":0.07712766,"top":0.25339186,"width":0.042220745,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"elements_fts_idx","depth":9,"bounds":{"left":0.073803194,"top":0.27214685,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXS
taticText","text":"elements_fts_idx","depth":10,"bounds":{"left":0.07712766,"top":0.27613726,"width":0.0390625,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"frames","depth":9,"bounds":{"left":0.073803194,"top":0.29489225,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"frames","depth":10,"bounds":{"left":0.07712766,"top":0.2988827,"width":0.015791224,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"frames_fts (v)","depth":9,"bounds":{"left":0.073803194,"top":0.31763768,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"frames_fts","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"frames_fts","depth":10,"bounds":{"left":0.07712766,"top":0.3216281,"width":0.025930852,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(v)","depth":11,"bounds":{"left":0.10305851,"top":0.3200319,"width":0.004986702,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"frames_fts_config","depth":9,"bounds":{"left":0.073803194,"top":0.34038308,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"frames_fts_config","depth":10,"bounds":{"left":0.07712766,"top":0.3443735,"width":0.04138963,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"frames_fts_data","depth":9,"bounds":{"left":0.073
803194,"top":0.36312848,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"frames_fts_data","depth":10,"bounds":{"left":0.07712766,"top":0.36711892,"width":0.03723404,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"frames_fts_idx","depth":9,"bounds":{"left":0.073803194,"top":0.3858739,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"frames_fts_idx","depth":10,"bounds":{"left":0.07712766,"top":0.38986433,"width":0.033909574,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"meetings","depth":9,"bounds":{"left":0.073803194,"top":0.4086193,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"meetings","depth":10,"bounds":{"left":0.07712766,"top":0.41260973,"width":0.020944148,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"memories","depth":9,"bounds":{"left":0.073803194,"top":0.43136472,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"memories","depth":10,"bounds":{"left":0.07712766,"top":0.43535516,"width":0.02244016,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"memories_fts 
(v)","depth":9,"bounds":{"left":0.073803194,"top":0.45411015,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"memories_fts","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"memories_fts","depth":10,"bounds":{"left":0.07712766,"top":0.45810056,"width":0.032579787,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(v)","depth":11,"bounds":{"left":0.109707445,"top":0.45650437,"width":0.004986702,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"memories_fts_config","depth":9,"bounds":{"left":0.073803194,"top":0.47685555,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"memories_fts_config","depth":10,"bounds":{"left":0.07712766,"top":0.48084596,"width":0.048038565,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"memories_fts_data","depth":9,"bounds":{"left":0.073803194,"top":0.49960095,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"memories_fts_data","depth":10,"bounds":{"left":0.07712766,"top":0.50359136,"width":0.043882977,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"memories_fts_idx","depth":9,"bounds":{"left":0.073803194,"top":0.5223464,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AX
StaticText","text":"memories_fts_idx","depth":10,"bounds":{"left":0.07712766,"top":0.5263368,"width":0.04055851,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"ocr_text","depth":9,"bounds":{"left":0.073803194,"top":0.5450918,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"ocr_text","depth":10,"bounds":{"left":0.07712766,"top":0.5490822,"width":0.018783245,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"pipe_executions","depth":9,"bounds":{"left":0.073803194,"top":0.5678372,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"pipe_executions","depth":10,"bounds":{"left":0.07712766,"top":0.5718276,"width":0.036901597,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"pipe_scheduler_state","depth":9,"bounds":{"left":0.073803194,"top":0.5905826,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"pipe_scheduler_state","depth":10,"bounds":{"left":0.07712766,"top":0.594573,"width":0.049035903,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"secrets","depth":9,"bounds":{"left":0.073803194,"top":0.61332804,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStat
icText","text":"secrets","depth":10,"bounds":{"left":0.07712766,"top":0.61731845,"width":0.016788565,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"speaker_embeddings","depth":9,"bounds":{"left":0.073803194,"top":0.6360734,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"speaker_embeddings","depth":10,"bounds":{"left":0.07712766,"top":0.6400638,"width":0.04886968,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"speakers","depth":9,"bounds":{"left":0.073803194,"top":0.65881884,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"speakers","depth":10,"bounds":{"left":0.07712766,"top":0.66280925,"width":0.020611702,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"sqlite_sequence","depth":9,"bounds":{"left":0.073803194,"top":0.6815643,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"sqlite_sequence","depth":10,"bounds":{"left":0.07712766,"top":0.6855547,"width":0.03723404,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"sqlite_stat1","depth":9,"bounds":{"left":0.073803194,"top":0.70430964,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","
text":"sqlite_stat1","depth":10,"bounds":{"left":0.07712766,"top":0.70830005,"width":0.025930852,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"sqlite_stat4","depth":9,"bounds":{"left":0.073803194,"top":0.7270551,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"sqlite_stat4","depth":10,"bounds":{"left":0.07712766,"top":0.7310455,"width":0.026595745,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"tags","depth":9,"bounds":{"left":0.073803194,"top":0.7498005,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"tags","depth":10,"bounds":{"left":0.07712766,"top":0.7537909,"width":0.009973404,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"ui_events","depth":9,"bounds":{"left":0.073803194,"top":0.7725459,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"ui_events","depth":10,"bounds":{"left":0.07712766,"top":0.7765363,"width":0.021775266,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"ui_events_fts 
(v)","depth":9,"bounds":{"left":0.073803194,"top":0.7952913,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"ui_events_fts","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"ui_events_fts","depth":10,"bounds":{"left":0.07712766,"top":0.7992817,"width":0.031914894,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"(v)","depth":11,"bounds":{"left":0.109042555,"top":0.79768556,"width":0.0048204786,"height":0.010774142},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"ui_events_fts_config","depth":9,"bounds":{"left":0.073803194,"top":0.81803674,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"ui_events_fts_config","depth":10,"bounds":{"left":0.07712766,"top":0.82202715,"width":0.04737367,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"ui_events_fts_data","depth":9,"bounds":{"left":0.073803194,"top":0.8407821,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"ui_events_fts_data","depth":10,"bounds":{"left":0.07712766,"top":0.8447725,"width":0.043218084,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"ui_events_fts_idx","depth":9,"bounds":{"left":0.073803194,"top":0.86352754,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role"
:"AXStaticText","text":"ui_events_fts_idx","depth":10,"bounds":{"left":0.07712766,"top":0.86751795,"width":0.039893616,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"video_chunks","depth":9,"bounds":{"left":0.073803194,"top":0.88627297,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"video_chunks","depth":10,"bounds":{"left":0.07712766,"top":0.8902634,"width":0.03125,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"vision_tags","depth":9,"bounds":{"left":0.073803194,"top":0.90901834,"width":0.06200133,"height":0.022745412},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"vision_tags","depth":10,"bounds":{"left":0.07712766,"top":0.91300875,"width":0.025930852,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Toggle helper tables","depth":8,"bounds":{"left":0.073803194,"top":0.9596967,"width":0.04637633,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Toggle helper 
tables","depth":9,"bounds":{"left":0.073803194,"top":0.9596967,"width":0.04637633,"height":0.014764565},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Structure","depth":9,"bounds":{"left":0.1456117,"top":0.0,"width":0.03274601,"height":0.032322425},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Structure","depth":10,"bounds":{"left":0.1512633,"top":0.0,"width":0.02144282,"height":0.014764565},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Content","depth":9,"bounds":{"left":0.1783577,"top":0.0,"width":0.029421542,"height":0.032322425},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Content","depth":10,"bounds":{"left":0.18400931,"top":0.0,"width":0.018118352,"height":0.014764565},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Query","depth":9,"bounds":{"left":0.20777926,"top":0.0,"width":0.025265958,"height":0.032322425},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Query","depth":10,"bounds":{"left":0.21343085,"top":0.0,"width":0.013962766,"height":0.014764565},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"Export","depth":9,"bounds":{"left":0.2330452,"top":0.0,"width":0.026097074,"height":0.032322425},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"Export","depth":10,"bounds":{"left":0.23869681,"top":0.0,"width":0.014793883,"height":0.014764565},"
on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"id","depth":10,"bounds":{"left":0.14727394,"top":0.0071827616,"width":0.004155585,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"id","depth":11,"bounds":{"left":0.14727394,"top":0.0071827616,"width":0.004155585,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"audio_chunk_id","depth":10,"bounds":{"left":0.15774602,"top":0.0071827616,"width":0.034408245,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"audio_chunk_id","depth":11,"bounds":{"left":0.15774602,"top":0.0071827616,"width":0.034408245,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"offset_index","depth":10,"bounds":{"left":0.19514628,"top":0.0071827616,"width":0.027426861,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"offset_index","depth":11,"bounds":{"left":0.19514628,"top":0.0071827616,"width":0.027426861,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"timestamp","depth":10,"bounds":{"left":0.22556517,"top":0.0071827616,"width":0.023271276,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"timestamp","depth":11,"bounds":{"left":0.22556517,"top":0.0071827616,"width":0.023271276,"height":0.012769354},"on
_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"transcription","depth":10,"bounds":{"left":0.28922874,"top":0.0071827616,"width":0.028091755,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"transcription","depth":11,"bounds":{"left":0.28922874,"top":0.0071827616,"width":0.028091755,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"device","depth":10,"bounds":{"left":0.3203125,"top":0.0071827616,"width":0.014295213,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"device","depth":11,"bounds":{"left":0.3203125,"top":0.0071827616,"width":0.014295213,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"is_input_device","depth":10,"bounds":{"left":0.34740692,"top":0.0071827616,"width":0.034408245,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"is_input_device","depth":11,"bounds":{"left":0.34740692,"top":0.0071827616,"width":0.034408245,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"speaker_id","depth":10,"bounds":{"left":0.38480717,"top":0.0071827616,"width":0.023769947,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"speaker_id","depth":11,"bounds":{"left":0.38480717,"top":0.0071827616,"width":0.023769947,"height":0.0127
69354},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"transcription_engine","depth":10,"bounds":{"left":0.41156915,"top":0.0071827616,"width":0.04537899,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"transcription_engine","depth":11,"bounds":{"left":0.41156915,"top":0.0071827616,"width":0.04537899,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"start_time","depth":10,"bounds":{"left":0.45994017,"top":0.0071827616,"width":0.022772606,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"start_time","depth":11,"bounds":{"left":0.45994017,"top":0.0071827616,"width":0.022772606,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"end_time","depth":10,"bounds":{"left":0.48819813,"top":0.0071827616,"width":0.02044548,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"end_time","depth":11,"bounds":{"left":0.48819813,"top":0.0071827616,"width":0.02044548,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"text_length","depth":10,"bounds":{"left":0.5162899,"top":0.0071827616,"width":0.025099734,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"text_length","depth":11,"bounds":{"left":0.5162899,"top":0.0071827616,"width":0.025099734
,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"sync_id","depth":10,"bounds":{"left":0.5443817,"top":0.0071827616,"width":0.016954787,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"sync_id","depth":11,"bounds":{"left":0.5443817,"top":0.0071827616,"width":0.016954787,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"synced_at","depth":10,"bounds":{"left":0.56432843,"top":0.0071827616,"width":0.022606382,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"synced_at","depth":11,"bounds":{"left":0.56432843,"top":0.0071827616,"width":0.022606382,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"redacted_at","depth":10,"bounds":{"left":0.58992684,"top":0.0071827616,"width":0.02642952,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"redacted_at","depth":11,"bounds":{"left":0.58992684,"top":0.0071827616,"width":0.02642952,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"51","depth":10,"bounds":{"left":0.14727394,"top":0.025139665,"width":0.004654255,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"131","depth":10,"bounds":{"left":0.15774602,"top":0.025139665,"width":0.0066489363,"height":0.012769354},"on_screen":true,"help_text":"","role_description":
"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"0","depth":10,"bounds":{"left":0.19514628,"top":0.025139665,"width":0.0028257978,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2026-05-10T18:35:08+00:00","depth":10,"bounds":{"left":0.22556517,"top":0.025139665,"width":0.06050532,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Does it provide any card?","depth":10,"bounds":{"left":0.28922874,"top":0.025139665,"width":0.023603724,"height":0.044293694},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip","depth":10,"bounds":{"left":0.3203125,"top":0.025139665,"width":0.021609042,"height":0.028332002},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"True","depth":10,"bounds":{"left":0.34740692,"top":0.025139665,"width":0.009142287,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1","depth":10,"bounds":{"left":0.38480717,"top":0.025139665,"width":0.0019946808,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"WhisperTiny","depth":10,"bounds":{"left":0.41156915,"top":0.025139665,"width":0.025099734,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"28.6313125","depth":10,"bounds":{"left":0.45994017,"top":0.025139665,"width":0.023769947,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"29.8631875","depth":10,"bounds":{"left":0.48819813,"top":0.025139665,"width":0.02443484,"height":0.012769354},"on_screen":true,"help_text":"
","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"26","depth":10,"bounds":{"left":0.5162899,"top":0.025139665,"width":0.005319149,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"NULL","depth":11,"bounds":{"left":0.5443817,"top":0.026735835,"width":0.009142287,"height":0.011173184},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"NULL","depth":11,"bounds":{"left":0.56432843,"top":0.026735835,"width":0.009142287,"height":0.011173184},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"NULL","depth":11,"bounds":{"left":0.58992684,"top":0.026735835,"width":0.009142287,"height":0.011173184},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"52","depth":10,"bounds":{"left":0.14727394,"top":0.07462091,"width":0.005319149,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"134","depth":10,"bounds":{"left":0.15774602,"top":0.07462091,"width":0.007480053,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"0","depth":10,"bounds":{"left":0.19514628,"top":0.07462091,"width":0.0028257978,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2026-05-10T18:35:38+00:00","depth":10,"bounds":{"left":0.22556517,"top":0.07462091,"width":0.06050532,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Thank you for 
watching.","depth":10,"bounds":{"left":0.28922874,"top":0.07462091,"width":0.027925532,"height":0.028731046},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip","depth":10,"bounds":{"left":0.3203125,"top":0.07462091,"width":0.021609042,"height":0.028731046},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"True","depth":10,"bounds":{"left":0.34740692,"top":0.07462091,"width":0.009142287,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2","depth":10,"bounds":{"left":0.38480717,"top":0.07462091,"width":0.0026595744,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"WhisperTiny","depth":10,"bounds":{"left":0.41156915,"top":0.07462091,"width":0.025099734,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"27.4838125","depth":10,"bounds":{"left":0.45994017,"top":0.07462091,"width":0.024102394,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"29.0363125","depth":10,"bounds":{"left":0.48819813,"top":0.07462091,"width":0.02443484,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"24","depth":10,"bounds":{"left":0.5162899,"top":0.07462091,"width":0.005319149,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"NULL","depth":11,"bounds":{"left":0.5443817,"top":0.07621708,"width":0.009142287,"height":0.011173184},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"NULL","depth":11,"bounds
":{"left":0.56432843,"top":0.07621708,"width":0.009142287,"height":0.011173184},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"NULL","depth":11,"bounds":{"left":0.58992684,"top":0.07621708,"width":0.009142287,"height":0.011173184},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"53","depth":10,"bounds":{"left":0.14727394,"top":0.10853951,"width":0.005485372,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"136","depth":10,"bounds":{"left":0.15774602,"top":0.10853951,"width":0.007480053,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"0","depth":10,"bounds":{"left":0.19514628,"top":0.10853951,"width":0.0028257978,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2026-05-10T18:36:08+00:00","depth":10,"bounds":{"left":0.22556517,"top":0.10853951,"width":0.06067154,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Don't ask me, I don't know what 
you'r","depth":10,"bounds":{"left":0.28922874,"top":0.10853951,"width":0.028091755,"height":0.044293694},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"...","depth":10,"bounds":{"left":0.3118351,"top":0.14006385,"width":0.0038231383,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"...","depth":11,"bounds":{"left":0.3118351,"top":0.14006385,"width":0.0038231383,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip","depth":10,"bounds":{"left":0.3203125,"top":0.10853951,"width":0.021609042,"height":0.028332002},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"True","depth":10,"bounds":{"left":0.34740692,"top":0.10853951,"width":0.009142287,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1","depth":10,"bounds":{"left":0.38480717,"top":0.10853951,"width":0.0019946808,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"WhisperTiny","depth":10,"bounds":{"left":0.41156915,"top":0.10853951,"width":0.025099734,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"3.3019375","depth":10,"bounds":{"left":0.45994017,"top":0.10853951,"width":0.021609042,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"26.1844375","depth":10,"bounds":{"left":0.48819813,"top":0.10853951,"width":0.02443484,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"
AXStaticText","text":"129","depth":10,"bounds":{"left":0.5162899,"top":0.10853951,"width":0.00731383,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"NULL","depth":11,"bounds":{"left":0.5443817,"top":0.110135674,"width":0.009142287,"height":0.011173184},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"NULL","depth":11,"bounds":{"left":0.56432843,"top":0.110135674,"width":0.009142287,"height":0.011173184},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"NULL","depth":11,"bounds":{"left":0.58992684,"top":0.110135674,"width":0.009142287,"height":0.011173184},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"54","depth":10,"bounds":{"left":0.14727394,"top":0.15802075,"width":0.005485372,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"138","depth":10,"bounds":{"left":0.15774602,"top":0.15802075,"width":0.007480053,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"0","depth":10,"bounds":{"left":0.19514628,"top":0.15802075,"width":0.0028257978,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2026-05-10T18:36:38+00:00","depth":10,"bounds":{"left":0.22556517,"top":0.15802075,"width":0.06050532,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"That's not a good answer. 
That's a good a","depth":10,"bounds":{"left":0.28922874,"top":0.15802075,"width":0.027925532,"height":0.059856344},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXLink","text":"...","depth":10,"bounds":{"left":0.29288563,"top":0.20510775,"width":0.0038231383,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"link","subrole":"AXUnknown","is_enabled":true,"is_focused":false,"is_selected":false},{"role":"AXStaticText","text":"...","depth":11,"bounds":{"left":0.29288563,"top":0.20510775,"width":0.0038231383,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip","depth":10,"bounds":{"left":0.3203125,"top":0.15802075,"width":0.021609042,"height":0.028332002},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"True","depth":10,"bounds":{"left":0.34740692,"top":0.15802075,"width":0.009142287,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2","depth":10,"bounds":{"left":0.38480717,"top":0.15802075,"width":0.0026595744,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"WhisperTiny","depth":10,"bounds":{"left":0.41156915,"top":0.15802075,"width":0.025099734,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"16.4644375","depth":10,"bounds":{"left":0.45994017,"top":0.15802075,"width":0.024601065,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"23.7206875","depth":10,"bounds":{"left":0.48819813,"top":0.15802075,"width":0.024601065,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnkno
wn"},{"role":"AXStaticText","text":"70","depth":10,"bounds":{"left":0.5162899,"top":0.15802075,"width":0.0051529254,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"NULL","depth":11,"bounds":{"left":0.5443817,"top":0.15961692,"width":0.009142287,"height":0.011173184},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"NULL","depth":11,"bounds":{"left":0.56432843,"top":0.15961692,"width":0.009142287,"height":0.011173184},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"NULL","depth":11,"bounds":{"left":0.58992684,"top":0.15961692,"width":0.009142287,"height":0.011173184},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"55","depth":10,"bounds":{"left":0.14727394,"top":0.22306465,"width":0.005319149,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"138","depth":10,"bounds":{"left":0.15774602,"top":0.22306465,"width":0.007480053,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"0","depth":10,"bounds":{"left":0.19514628,"top":0.22306465,"width":0.0028257978,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2026-05-10T18:36:38+00:00","depth":10,"bounds":{"left":0.22556517,"top":0.22306465,"width":0.06050532,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"Now, the first one is the first 
one.","depth":10,"bounds":{"left":0.28922874,"top":0.22306465,"width":0.027094414,"height":0.044293694},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore AeroClip","depth":10,"bounds":{"left":0.3203125,"top":0.22306465,"width":0.021609042,"height":0.028731046},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"True","depth":10,"bounds":{"left":0.34740692,"top":0.22306465,"width":0.009142287,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1","depth":10,"bounds":{"left":0.38480717,"top":0.22306465,"width":0.0019946808,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"WhisperTiny","depth":10,"bounds":{"left":0.41156915,"top":0.22306465,"width":0.025099734,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"26.6231875","depth":10,"bounds":{"left":0.45994017,"top":0.22306465,"width":0.024268618,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"29.4244375","depth":10,"bounds":{"left":0.48819813,"top":0.22306465,"width":0.024933511,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"37","depth":10,"bounds":{"left":0.5162899,"top":0.22306465,"width":0.0051529254,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"NULL","depth":11,"bounds":{"left":0.5443817,"top":0.22466081,"width":0.009142287,"height":0.011173184},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"NULL","depth":11,"bounds":{
"left":0.56432843,"top":0.22466081,"width":0.009142287,"height":0.011173184},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"NULL","depth":11,"bounds":{"left":0.58992684,"top":0.22466081,"width":0.009142287,"height":0.011173184},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"56","depth":10,"bounds":{"left":0.14727394,"top":0.27294493,"width":0.005485372,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"141","depth":10,"bounds":{"left":0.15774602,"top":0.27294493,"width":0.0068151597,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"0","depth":10,"bounds":{"left":0.19514628,"top":0.27294493,"width":0.0028257978,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"2026-05-10T18:37:08+00:00","depth":10,"bounds":{"left":0.22556517,"top":0.27294493,"width":0.060339097,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"it can't be by each.","depth":10,"bounds":{"left":0.28922874,"top":0.27294493,"width":0.02642952,"height":0.028332002},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"soundcore 
AeroClip","depth":10,"bounds":{"left":0.3203125,"top":0.27294493,"width":0.021609042,"height":0.028332002},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"True","depth":10,"bounds":{"left":0.34740692,"top":0.27294493,"width":0.009142287,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"1","depth":10,"bounds":{"left":0.38480717,"top":0.27294493,"width":0.0019946808,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"WhisperTiny","depth":10,"bounds":{"left":0.41156915,"top":0.27294493,"width":0.025099734,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"0.7706875","depth":10,"bounds":{"left":0.45994017,"top":0.27294493,"width":0.021941489,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"15.1144375","depth":10,"bounds":{"left":0.48819813,"top":0.27294493,"width":0.022938829,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"21","depth":10,"bounds":{"left":0.5162899,"top":0.27294493,"width":0.004654255,"height":0.012769354},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"NULL","depth":11,"bounds":{"left":0.5443817,"top":0.2745411,"width":0.009142287,"height":0.011173184},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"NULL","depth":11,"bounds":{"left":0.56432843,"top":0.2745411,"width":0.009142287,"height":0.011173184},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"NULL","depth":11,"bounds":{"left":0.5899
2684,"top":0.2745411,"width":0.009142287,"height":0.011173184},"on_screen":false,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"57","depth":10,"bounds":{"left":0.14727394,"top":0.3064645,"width":0.0051529254,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"},{"role":"AXStaticText","text":"143","depth":10,"bounds":{"left":0.15774602,"top":0.3064645,"width":0.007480053,"height":0.012769354},"on_screen":true,"help_text":"","role_description":"text","subrole":"AXUnknown"}]...
|
5518951372379966754
|
9192404088752805728
|
click
|
accessibility
|
NULL
|
SQLite Web: db.sqlite
SQLite Web: db.sqlite
Close SQLite Web: db.sqlite
SQLite Web: db.sqlite
Close tab
New Tab
Customize sidebar
Open Google Gemini (⌃X)
Open history (⇧⌘H)
Open bookmarks (⌘B)
Bitwarden
sqlite-web 0.7.2
sqlite-web 0.7.2
db.sqlite
db.sqlite
audio_transcriptions
472 rows, showing page 2
Query
Query
table name...
_sqlx_migrations
_sqlx_migrations
audio_chunks
audio_chunks
audio_tags
audio_tags
audio_transcriptions
audio_transcriptions
audio_transcriptions_fts (v)
audio_transcriptions_fts
(v)
audio_transcriptions_...
audio_transcriptions_...
audio_transcriptions_...
audio_transcriptions_...
audio_transcriptions_...
audio_transcriptions_...
elements
elements
elements_fts (v)
elements_fts
(v)
elements_fts_config
elements_fts_config
elements_fts_data
elements_fts_data
elements_fts_idx
elements_fts_idx
frames
frames
frames_fts (v)
frames_fts
(v)
frames_fts_config
frames_fts_config
frames_fts_data
frames_fts_data
frames_fts_idx
frames_fts_idx
meetings
meetings
memories
memories
memories_fts (v)
memories_fts
(v)
memories_fts_config
memories_fts_config
memories_fts_data
memories_fts_data
memories_fts_idx
memories_fts_idx
ocr_text
ocr_text
pipe_executions
pipe_executions
pipe_scheduler_state
pipe_scheduler_state
secrets
secrets
speaker_embeddings
speaker_embeddings
speakers
speakers
sqlite_sequence
sqlite_sequence
sqlite_stat1
sqlite_stat1
sqlite_stat4
sqlite_stat4
tags
tags
ui_events
ui_events
ui_events_fts (v)
ui_events_fts
(v)
ui_events_fts_config
ui_events_fts_config
ui_events_fts_data
ui_events_fts_data
ui_events_fts_idx
ui_events_fts_idx
video_chunks
video_chunks
vision_tags
vision_tags
Toggle helper tables
Toggle helper tables
Structure
Structure
Content
Content
Query
Query
Export
Export
id
id
audio_chunk_id
audio_chunk_id
offset_index
offset_index
timestamp
timestamp
transcription
transcription
device
device
is_input_device
is_input_device
speaker_id
speaker_id
transcription_engine
transcription_engine
start_time
start_time
end_time
end_time
text_length
text_length
sync_id
sync_id
synced_at
synced_at
redacted_at
redacted_at
51
131
0
2026-05-10T18:35:08+00:00
Does it provide any card?
soundcore AeroClip
True
1
WhisperTiny
28.6313125
29.8631875
26
NULL
NULL
NULL
52
134
0
2026-05-10T18:35:38+00:00
Thank you for watching.
soundcore AeroClip
True
2
WhisperTiny
27.4838125
29.0363125
24
NULL
NULL
NULL
53
136
0
2026-05-10T18:36:08+00:00
Don't ask me, I don't know what you'r
...
...
soundcore AeroClip
True
1
WhisperTiny
3.3019375
26.1844375
129
NULL
NULL
NULL
54
138
0
2026-05-10T18:36:38+00:00
That's not a good answer. That's a good a
...
...
soundcore AeroClip
True
2
WhisperTiny
16.4644375
23.7206875
70
NULL
NULL
NULL
55
138
0
2026-05-10T18:36:38+00:00
Now, the first one is the first one.
soundcore AeroClip
True
1
WhisperTiny
26.6231875
29.4244375
37
NULL
NULL
NULL
56
141
0
2026-05-10T18:37:08+00:00
it can't be by each.
soundcore AeroClip
True
1
WhisperTiny
0.7706875
15.1144375
21
NULL
NULL
NULL
57
143...
|
29413
|
NULL
|
NULL
|
NULL
|
|
31094
|
1224
|
6
|
2026-05-13T08:20:43.171934+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-13/1778 /Users/lukas/.screenpipe/data/data/2026-05-13/1778660443171_m2.jpg...
|
Windsurf
|
screenpipe [SSH: nas] — screenpipe_sync_helpers.sh screenpipe [SSH: nas] — screenpipe_sync_helpers.sh — Untracked...
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Explorer (⌥⌘E)
Search (⇧⌘F)
Source Control (⇧⌘ Explorer (⌥⌘E)
Search (⇧⌘F)
Source Control (⇧⌘G) - 19 pending changes
19
Codemaps
DeepWiki
Run and Debug
Remote Explorer
Extensions (⇧⌘X)
Explorer
Explorer
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
screenpipe [SSH: nas]
#recycle
app
data
lib
screenpipe_sync_db.sh
U
screenpipe_sync_files.sh
U
screenpipe_sync_helpers.sh
U
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
archive.db.bak-pre-installid
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_fts_migrate.sh
U
screenpipe_sync copy.sh
U
screenpipe_sync_old.sh
U
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
M
screenpipe.db
Outline Section
Outline
Outline
Timeline Section
Timeline
Timeline
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 41257, 36613, 33153, 45375
4
Notifications
Windsurf - Settings
Teams, Daily: 0% quota used · Weekly: 68% quota used
Shell Script
LF
UTF-8
Spaces: 2
Ln 1, Col 1
Screen Reader Optimized
Info: Setting up SSH Host (details): Creating local forwarding server...
Clear
Refactor Sync Script
15h
2h
1m
Get familiar with the project. The idea is to copy the data from the Mac to the NAS (here). I am trying to update the database using
@
screenpipe_fts_migrate.sh
and then run
@
screenpipe_sync.sh
for a day to move the data. Review and see if you understand everything.
Expand content to scroll
Thought
for 1s
I'll read the two scripts and explore the project structure to understand the data migration setup.
Thoughts
Read
screenpipe_fts_migrate.sh and screenpipe_sync.sh
Analyzed
/volume1/screenpipe
Thought
for 2s
I understand the setup. Here's my analysis:
Project Overview
Project Overview
You're migrating Screenpipe data from Mac to NAS. The workflow:
1.
screenpipe_fts_migrate.sh
- One-time database schema migration:
Adds
install_id
column to all synced tables (prevents ID collisions across reinstalls)
Creates
_installs
registry table
Rebuilds FTS (full-text search) tables with new schema
Optional but recommended
- running this on the NAS is 5-10x faster than letting
screenpipe_sync.sh
do it inline
Idempotent (safe to re-run)
2.
screenpipe_sync.sh
- Daily sync script:
Copies SQLite data from Mac (
~/.screenpipe/db.sqlite
) to NAS (
/Volumes/screenpipe/archive.db
)
Syncs vision tables (frames, elements, OCR, UI events) and audio tables (chunks, transcriptions, speakers)
Copies frame data files and audio MP4s via rsync
Uses
install_id
(UUID) to handle multiple screenpipe installations safely
Schema-drift tolerant (auto-adds new columns)
Run per date:
./screenpipe_sync.sh
(yesterday) or
./screenpipe_sync.sh 2026-05-12
Current NAS State
Current NAS State
archive.db
- 12.9 GB (main archive)...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Explorer (⌥⌘E)","depth":18,"bounds":{"left":0.0,"top":0.047885075,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true,"is_expanded":true},{"role":"AXStaticText","text":"","depth":21,"bounds":{"left":0.0033244682,"top":0.05586592,"width":0.004986702,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Search (⇧⌘F)","depth":18,"bounds":{"left":0.0,"top":0.07581804,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"bounds":{"left":0.0033244682,"top":0.083798885,"width":0.004986702,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Source Control (⇧⌘G) - 19 pending 
changes","depth":18,"bounds":{"left":0.0,"top":0.103751,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"bounds":{"left":0.0033244682,"top":0.11173184,"width":0.004986702,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"19","depth":21,"bounds":{"left":0.005319149,"top":0.11811652,"width":0.0033244682,"height":0.007980846},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.0056515955,"top":0.118914604,"width":0.0013297872,"height":0.0071827616}},{"char_start":1,"char_count":1,"bounds":{"left":0.006981383,"top":0.118914604,"width":0.0016622341,"height":0.0071827616}}],"role_description":"text"},{"role":"AXRadioButton","text":"Codemaps","depth":18,"bounds":{"left":0.0,"top":0.13168396,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"bounds":{"left":0.0033244682,"top":0.1396648,"width":0.004986702,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"DeepWiki","depth":18,"bounds":{"left":0.0,"top":0.15961692,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXRadioButton","text":"Run and 
Debug","depth":18,"bounds":{"left":0.0,"top":0.18754987,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"bounds":{"left":0.0033244682,"top":0.19553073,"width":0.004986702,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Remote Explorer","depth":18,"bounds":{"left":0.0,"top":0.21548285,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"bounds":{"left":0.0033244682,"top":0.22346368,"width":0.004986702,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Extensions (⇧⌘X)","depth":18,"bounds":{"left":0.0,"top":0.2434158,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"bounds":{"left":0.0033244682,"top":0.25139666,"width":0.004986702,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Explorer","depth":17,"bounds":{"left":0.01462766,"top":0.047885075,"width":0.013630319,"height":0.023144454},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Explorer","depth":18,"bounds":{"left":0.01462766,"top":0.054269753,"width":0.013630319,"height":0.009577015},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.014960106,"top":0.055067837,"width":0.0019946808,"height":0.008778931}},{"char_start":1,"char_count":7,"bounds":{"left":0.016954787,"top":0.055067837,"width":0.011303191,"height":0.008778931}}],"role_description":"text"},{"role":"AXButton","text":"Explorer 
Section: screenpipe [SSH: nas]","depth":21,"bounds":{"left":0.011635638,"top":0.07102953,"width":0.0831117,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":true},{"role":"AXStaticText","text":"","depth":23,"bounds":{"left":0.011968086,"top":0.0726257,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Explorer Section: screenpipe [SSH: nas]","depth":22,"bounds":{"left":0.016954787,"top":0.07102953,"width":0.035904255,"height":0.014365523},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"screenpipe [SSH: nas]","depth":23,"bounds":{"left":0.016954787,"top":0.07342378,"width":0.035904255,"height":0.009577015},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.017287234,"top":0.07342378,"width":0.0019946808,"height":0.009577015}},{"char_start":1,"char_count":20,"bounds":{"left":0.018949468,"top":0.07342378,"width":0.034242023,"height":0.009577015}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.014295213,"top":0.08699122,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.08699122,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"#recycle","depth":27,"bounds":{"left":0.025930852,"top":0.08699122,"width":0.01462766,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.087789305,"width":0.0026595744,"height":0.0103751}},{"char_start":1,"char_count":7,"bounds":{"left":0.02825798,"top":0.087789305,"width":0.012300532,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.014295213,"top":0.10215483,"width":0.004654255,"height":0.01117
3184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.10215483,"width":0.0039893617,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"app","depth":27,"bounds":{"left":0.025930852,"top":0.10215483,"width":0.0063164895,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.10215483,"width":0.0023271276,"height":0.011173184}},{"char_start":1,"char_count":2,"bounds":{"left":0.027925532,"top":0.10215483,"width":0.004654255,"height":0.011173184}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.08676862,"top":0.10215483,"width":0.0039893617,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.014295213,"top":0.11652035,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.11652035,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"data","depth":27,"bounds":{"left":0.025930852,"top":0.11652035,"width":0.0076462766,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.11731844,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":3,"bounds":{"left":0.02825798,"top":0.11731844,"width":0.005319149,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.01462766,"top":0.13088587,"width":0.0043218085,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.13088587,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"lib","
depth":27,"bounds":{"left":0.025930852,"top":0.13088587,"width":0.0039893617,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.13168396,"width":0.0009973404,"height":0.0103751}},{"char_start":1,"char_count":2,"bounds":{"left":0.026928192,"top":0.13168396,"width":0.0033244682,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.08676862,"top":0.13168396,"width":0.0039893617,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.021941489,"top":0.14604948,"width":0.003656915,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync_db.sh","depth":27,"bounds":{"left":0.027925532,"top":0.14604948,"width":0.039893616,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.02825798,"top":0.14604948,"width":0.0019946808,"height":0.011173184}},{"char_start":1,"char_count":20,"bounds":{"left":0.029920213,"top":0.14604948,"width":0.037898935,"height":0.011173184}}],"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"bounds":{"left":0.087765954,"top":0.14604948,"width":0.0026595744,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.021941489,"top":0.16041501,"width":0.003656915,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync_files.sh","depth":27,"bounds":{"left":0.027925532,"top":0.16041501,"width":0.04255319,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.02825798,"top":0.16121309,"width":0.0019946808,"height":0.0103751}},{"char_start":1,"char_count":23,"bounds":{"left":0.029920213,"top":0.16121309,"width":0.04089096,"height":0.0103751}}],"role_description":"text"},{"role":"AXSta
ticText","text":"U","depth":27,"bounds":{"left":0.087765954,"top":0.16121309,"width":0.0026595744,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.021941489,"top":0.17478053,"width":0.003656915,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync_helpers.sh","depth":27,"bounds":{"left":0.027925532,"top":0.17478053,"width":0.048204787,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.02825798,"top":0.17557861,"width":0.0019946808,"height":0.0103751}},{"char_start":1,"char_count":25,"bounds":{"left":0.029920213,"top":0.17557861,"width":0.046210106,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"bounds":{"left":0.087765954,"top":0.17557861,"width":0.0026595744,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.014295213,"top":0.18994413,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.18994413,"width":0.0039893617,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"logs","depth":27,"bounds":{"left":0.025930852,"top":0.18994413,"width":0.006981383,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.18994413,"width":0.0009973404,"height":0.011173184}},{"char_start":1,"char_count":3,"bounds":{"left":0.026928192,"top":0.18994413,"width":0.0063164895,"height":0.011173184}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.08676862,"top":0.18994413,"width":0.0039893617,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.014295213,"to
p":0.20430966,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.20430966,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"pipes","depth":27,"bounds":{"left":0.025930852,"top":0.20430966,"width":0.00930851,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.20510775,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":4,"bounds":{"left":0.02825798,"top":0.20510775,"width":0.006981383,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.21867518,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":".gitignore","depth":27,"bounds":{"left":0.025930852,"top":0.21867518,"width":0.015957447,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.21947326,"width":0.0013297872,"height":0.0103751}},{"char_start":1,"char_count":9,"bounds":{"left":0.026928192,"top":0.21947326,"width":0.014960106,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.23383878,"width":0.0039893617,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"app_settings.json","depth":27,"bounds":{"left":0.025930852,"top":0.23383878,"width":0.029920213,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.23383878,"width":0.0023271276,"height":0.011173184}},{"char_start":1,"char_count":16,"bounds":{"left":0.027925532,"top":0.23383878,"width":0.027925532,"height":0.011173184}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614
361,"top":0.2482043,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"archive.db","depth":27,"bounds":{"left":0.025930852,"top":0.2482043,"width":0.01761968,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.2490024,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":9,"bounds":{"left":0.027925532,"top":0.2490024,"width":0.015625,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.26256984,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"archive.db-bak","depth":27,"bounds":{"left":0.025930852,"top":0.26256984,"width":0.025265958,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.26336792,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":13,"bounds":{"left":0.027925532,"top":0.26336792,"width":0.023603724,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"bounds":{"left":0.087765954,"top":0.26336792,"width":0.0026595744,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.27773345,"width":0.0039893617,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"archive.db.bak-pre-installid","depth":27,"bounds":{"left":0.025930852,"top":0.27773345,"width":0.046210106,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.27773345,"width":0.0023271276,"height":0.011173184}},{"char_start":1,"char_count":27,"bounds":{"left":0.027925532,"top":0.27773345,"width":0.04454787,"height":0.011173184}}],"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"
bounds":{"left":0.087765954,"top":0.27773345,"width":0.0026595744,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.29209897,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"db.sqlite","depth":27,"bounds":{"left":0.025930852,"top":0.29209897,"width":0.01462766,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.29289705,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":8,"bounds":{"left":0.02825798,"top":0.29289705,"width":0.012300532,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.3064645,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"db.sqlite-shm","depth":27,"bounds":{"left":0.025930852,"top":0.3064645,"width":0.023271276,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.30726257,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":12,"bounds":{"left":0.02825798,"top":0.30726257,"width":0.021276595,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.3216281,"width":0.0039893617,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"db.sqlite-wal","depth":27,"bounds":{"left":0.025930852,"top":0.3216281,"width":0.021941489,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.3216281,"width":0.0023271276,"height":0.011173184}},{"char_start":1,"char_count":12,"bounds":{"left":0.02825798,"top":0.3216281,"width":0.019614361,"height":0.011173184}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":
27,"bounds":{"left":0.019614361,"top":0.33599362,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_fts_migrate.sh","depth":27,"bounds":{"left":0.025930852,"top":0.33599362,"width":0.04488032,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.3367917,"width":0.0019946808,"height":0.0103751}},{"char_start":1,"char_count":24,"bounds":{"left":0.027925532,"top":0.3367917,"width":0.04288564,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"bounds":{"left":0.087765954,"top":0.3367917,"width":0.0026595744,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.35035914,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync copy.sh","depth":27,"bounds":{"left":0.025930852,"top":0.35035914,"width":0.042220745,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.35115722,"width":0.0019946808,"height":0.0103751}},{"char_start":1,"char_count":22,"bounds":{"left":0.027925532,"top":0.35115722,"width":0.04055851,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"bounds":{"left":0.087765954,"top":0.35115722,"width":0.0026595744,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.36552274,"width":0.0039893617,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync_old.sh","depth":27,"bounds":{"left":0.025930852,"top":0.36552274,"width":0.04055851,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.36552274,"width":0.00
19946808,"height":0.011173184}},{"char_start":1,"char_count":21,"bounds":{"left":0.027925532,"top":0.36552274,"width":0.03856383,"height":0.011173184}}],"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"bounds":{"left":0.087765954,"top":0.36552274,"width":0.0026595744,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.37988827,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync_updated.sh","depth":27,"bounds":{"left":0.025930852,"top":0.37988827,"width":0.04920213,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.38068634,"width":0.0019946808,"height":0.0103751}},{"char_start":1,"char_count":25,"bounds":{"left":0.027925532,"top":0.38068634,"width":0.047539894,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"bounds":{"left":0.087765954,"top":0.38068634,"width":0.0026595744,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.3942538,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync.sh","depth":27,"bounds":{"left":0.025930852,"top":0.3942538,"width":0.03357713,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.39505187,"width":0.0019946808,"height":0.0103751}},{"char_start":1,"char_count":17,"bounds":{"left":0.027925532,"top":0.39505187,"width":0.03158245,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"M","depth":27,"bounds":{"left":0.087101065,"top":0.39505187,"width":0.0033244682,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"
bounds":{"left":0.019614361,"top":0.4094174,"width":0.0039893617,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe.db","depth":27,"bounds":{"left":0.025930852,"top":0.4094174,"width":0.023936171,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.4094174,"width":0.0019946808,"height":0.011173184}},{"char_start":1,"char_count":12,"bounds":{"left":0.027925532,"top":0.4094174,"width":0.022273935,"height":0.011173184}}],"role_description":"text"},{"role":"AXButton","text":"Outline Section","depth":21,"bounds":{"left":0.011635638,"top":0.95530725,"width":0.0831117,"height":0.015163607},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":23,"bounds":{"left":0.011968086,"top":0.95690346,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Outline","depth":22,"bounds":{"left":0.016954787,"top":0.95530725,"width":0.011635638,"height":0.015163607},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Outline","depth":23,"bounds":{"left":0.016954787,"top":0.9577015,"width":0.011635638,"height":0.009577015},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.017287234,"top":0.9584996,"width":0.0026595744,"height":0.009577015}},{"char_start":1,"char_count":6,"bounds":{"left":0.019946808,"top":0.9584996,"width":0.008976064,"height":0.009577015}}],"role_description":"text"},{"role":"AXButton","text":"Timeline 
Section","depth":21,"bounds":{"left":0.011635638,"top":0.9696728,"width":0.0831117,"height":0.015163607},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":23,"bounds":{"left":0.011968086,"top":0.97206706,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Timeline","depth":22,"bounds":{"left":0.016954787,"top":0.97047085,"width":0.013630319,"height":0.014365523},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Timeline","depth":23,"bounds":{"left":0.016954787,"top":0.9728651,"width":0.013630319,"height":0.009577015},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.017287234,"top":0.9728651,"width":0.0023271276,"height":0.009577015}},{"char_start":1,"char_count":7,"bounds":{"left":0.019281914,"top":0.9728651,"width":0.011635638,"height":0.009577015}}],"role_description":"text"},{"role":"AXButton","text":"remote SSH: nas","depth":16,"bounds":{"left":0.0016622341,"top":0.9848364,"width":0.024268618,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"bounds":{"left":0.0039893617,"top":0.98643255,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SSH: nas","depth":17,"bounds":{"left":0.00831117,"top":0.98723066,"width":0.015292553,"height":0.009577015},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.008643617,"top":0.98723066,"width":0.0009973404,"height":0.009577015}},{"char_start":1,"char_count":7,"bounds":{"left":0.009640957,"top":0.98723066,"width":0.012300532,"height":0.009577015}}],"role_description":"text"},{"role":"AXButton","text":"screenpipe (Git) - master*, Checkout 
Branch/Tag...","depth":16,"bounds":{"left":0.027260639,"top":0.9848364,"width":0.019281914,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"bounds":{"left":0.027925532,"top":0.98643255,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"master*","depth":17,"bounds":{"left":0.032247342,"top":0.98723066,"width":0.013630319,"height":0.009577015},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.032579787,"top":0.98723066,"width":0.0009973404,"height":0.009577015}},{"char_start":1,"char_count":6,"bounds":{"left":0.03357713,"top":0.98723066,"width":0.010970744,"height":0.009577015}}],"role_description":"text"},{"role":"AXButton","text":"screenpipe (Git) - Synchronize Changes","depth":16,"bounds":{"left":0.04654255,"top":0.9848364,"width":0.0063164895,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"No 
Problems","depth":16,"bounds":{"left":0.054853722,"top":0.9848364,"width":0.01861702,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"bounds":{"left":0.05618351,"top":0.98643255,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"0","depth":17,"bounds":{"left":0.06050532,"top":0.98723066,"width":0.0043218085,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":17,"bounds":{"left":0.064494684,"top":0.98643255,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"0","depth":17,"bounds":{"left":0.069148935,"top":0.98723066,"width":0.0029920214,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Forwarded Ports: 41257, 36613, 33153, 45375","depth":16,"bounds":{"left":0.07513298,"top":0.9848364,"width":0.010305851,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"bounds":{"left":0.07646277,"top":0.98643255,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"4","depth":17,"bounds":{"left":0.080784574,"top":0.98723066,"width":0.0033244682,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Notifications","depth":16,"bounds":{"left":0.99102396,"top":0.9848364,"width":0.008976042,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Windsurf - 
Settings","depth":16,"bounds":{"left":0.9567819,"top":0.9848364,"width":0.03357713,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Teams, Daily: 0% quota used · Weekly: 68% quota used","depth":16,"bounds":{"left":0.9421542,"top":0.9848364,"width":0.012965426,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Shell Script","depth":16,"bounds":{"left":0.91988033,"top":0.9848364,"width":0.020611702,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"LF","depth":16,"bounds":{"left":0.9115692,"top":0.9848364,"width":0.0066489363,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"UTF-8","depth":16,"bounds":{"left":0.8969415,"top":0.9848364,"width":0.013297873,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Spaces: 2","depth":16,"bounds":{"left":0.87699467,"top":0.9848364,"width":0.01861702,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Ln 1, Col 1","depth":16,"bounds":{"left":0.8557181,"top":0.9848364,"width":0.019946808,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Screen Reader 
Optimized","depth":16,"bounds":{"left":0.81050533,"top":0.9848364,"width":0.04454787,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Info: Setting up SSH Host (details): Creating local forwarding server...","depth":12,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"Clear","depth":12,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"Refactor Sync Script","depth":20,"bounds":{"left":0.7659575,"top":0.05347167,"width":0.034574468,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.7659575,"top":0.054269753,"width":0.0026595744,"height":0.0103751}},{"char_start":1,"char_count":19,"bounds":{"left":0.76828456,"top":0.054269753,"width":0.032247342,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"15h","depth":19,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"2h","depth":19,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"1m","depth":19,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"get familiar with the project. The idea is to copy the data from mac to nas (here). 
I am tryign to update the database using","depth":25,"bounds":{"left":0.7659575,"top":0.07102953,"width":0.2044548,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"@","depth":27,"bounds":{"left":0.76662236,"top":0.07102953,"width":0.0033244682,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_fts_migrate.sh","depth":27,"bounds":{"left":0.7699468,"top":0.07102953,"width":0.04288564,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"and then run","depth":25,"bounds":{"left":0.81349736,"top":0.07102953,"width":0.023603724,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"@","depth":27,"bounds":{"left":0.83776593,"top":0.07102953,"width":0.0029920214,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync.sh","depth":27,"bounds":{"left":0.84075797,"top":0.07102953,"width":0.032247342,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"for a day to move the data. 
Review and see is you understand everything.","depth":25,"bounds":{"left":0.7659575,"top":0.07102953,"width":0.21243352,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Expand content to scroll","depth":23,"bounds":{"left":0.87333775,"top":0.07102953,"width":0.012300532,"height":0.0007980846},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Thought","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.014295213,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"for 1s","depth":21,"bounds":{"left":0.77859044,"top":0.07102953,"width":0.00930851,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"I'll read the two scripts and explore the project structure to understand the data migration setup.","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.17220744,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Thoughts","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.015957447,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Read","depth":20,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.008643617,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_fts_migrate.sh and 
screenpipe_sync.sh","depth":20,"bounds":{"left":0.77293885,"top":0.07102953,"width":0.08643617,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Analyzed","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.015625,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":23,"bounds":{"left":0.77958775,"top":0.07102953,"width":0.0023271276,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"/volume1/screenpipe","depth":22,"bounds":{"left":0.78523934,"top":0.07102953,"width":0.038231384,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Thought","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.014295213,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"for 2s","depth":21,"bounds":{"left":0.77859044,"top":0.07102953,"width":0.009640957,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"I understand the setup. Here's my analysis:","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.07712766,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Project Overview","depth":20,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.23238032,"height":0.0007980846},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Project Overview","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.033909574,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"You're migrating Screenpipe data from Mac to NAS. 
The workflow:","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.118351065,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"1.","depth":22,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.0039893617,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"bounds":{"left":0.76728725,"top":0.07102953,"width":0.0023271276,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_fts_migrate.sh","depth":24,"bounds":{"left":0.77293885,"top":0.07102953,"width":0.053856384,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"- One-time database schema migration:","depth":21,"bounds":{"left":0.8267952,"top":0.07102953,"width":0.07247341,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Adds","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.010305851,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"install_id","depth":23,"bounds":{"left":0.78224736,"top":0.07102953,"width":0.01861702,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"column to all synced tables (prevents ID collisions across reinstalls)","depth":22,"bounds":{"left":0.8018617,"top":0.07102953,"width":0.12134308,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Creates","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.015292553,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"_installs","depth":23,"bounds":{"left":0.7869016,"top":0.07102953,"width":0.016954787,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"registry 
table","depth":22,"bounds":{"left":0.80452126,"top":0.07102953,"width":0.024933511,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Rebuilds FTS (full-text search) tables with new schema","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.09840426,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Optional but recommended","depth":23,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.05119681,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"- running this on the NAS is 5-10x faster than letting","depth":22,"bounds":{"left":0.82214093,"top":0.07102953,"width":0.09541223,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"bounds":{"left":0.9172208,"top":0.07102953,"width":0.0023271276,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync.sh","depth":24,"bounds":{"left":0.92287236,"top":0.07102953,"width":0.038896278,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"do it inline","depth":22,"bounds":{"left":0.9617686,"top":0.07102953,"width":0.019614361,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Idempotent (safe to 
re-run)","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.04920213,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"2.","depth":22,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.004654255,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"bounds":{"left":0.76795214,"top":0.07102953,"width":0.0023271276,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync.sh","depth":24,"bounds":{"left":0.77327126,"top":0.07102953,"width":0.039228722,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"- Daily sync script:","depth":21,"bounds":{"left":0.8121675,"top":0.07102953,"width":0.034574468,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Copies SQLite data from Mac (","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.05518617,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"~/.screenpipe/db.sqlite","depth":23,"bounds":{"left":0.8267952,"top":0.07102953,"width":0.04255319,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":") to NAS (","depth":22,"bounds":{"left":0.8703458,"top":0.07102953,"width":0.01761968,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"/Volumes/screenpipe/archive.db","depth":23,"bounds":{"left":0.88896275,"top":0.07102953,"width":0.05518617,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":")","depth":22,"bounds":{"left":0.94514626,"top":0.07102953,"width":0.0013297872,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Syncs vision tables (frames, elements, OCR, UI events) and audio tables (chunks, transcriptions, 
speakers)","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.19082446,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Copies frame data files and audio MP4s via rsync","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.08809841,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Uses","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.010305851,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"install_id","depth":23,"bounds":{"left":0.78224736,"top":0.07102953,"width":0.018284574,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"(UUID) to handle multiple screenpipe installations safely","depth":22,"bounds":{"left":0.8015292,"top":0.07102953,"width":0.10139628,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Schema-drift tolerant (auto-adds new columns)","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.08577128,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Run per date:","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.025265958,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"bounds":{"left":0.7962101,"top":0.07102953,"width":0.0023271276,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"./screenpipe_sync.sh","depth":24,"bounds":{"left":0.8015292,"top":0.07102953,"width":0.043550532,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"(yesterday) or","depth":22,"bounds":{"left":0.84474736,"top":0.07102953,"width":0.027260639,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"./screenpipe_sync.sh 
2026-05-12","depth":23,"bounds":{"left":0.87300533,"top":0.07102953,"width":0.05718085,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Current NAS State","depth":20,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.23238032,"height":0.0007980846},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Current NAS State","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.03656915,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.0023271276,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"archive.db","depth":24,"bounds":{"left":0.7765958,"top":0.07102953,"width":0.021609042,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"- 12.9 GB (main archive)","depth":22,"bounds":{"left":0.7982048,"top":0.07102953,"width":0.044215426,"height":0.0007980846},"on_screen":true,"role_description":"text"}]...
|
-5624173553536372063
|
9186163449064030895
|
click
|
accessibility
|
NULL
|
Explorer (⌥⌘E)
Search (⇧⌘F)
Source Control (⇧⌘ Explorer (⌥⌘E)
Search (⇧⌘F)
Source Control (⇧⌘G) - 19 pending changes
19
Codemaps
DeepWiki
Run and Debug
Remote Explorer
Extensions (⇧⌘X)
Explorer
Explorer
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
screenpipe [SSH: nas]
#recycle
app
data
lib
screenpipe_sync_db.sh
U
screenpipe_sync_files.sh
U
screenpipe_sync_helpers.sh
U
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
archive.db.bak-pre-installid
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_fts_migrate.sh
U
screenpipe_sync copy.sh
U
screenpipe_sync_old.sh
U
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
M
screenpipe.db
Outline Section
Outline
Outline
Timeline Section
Timeline
Timeline
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 41257, 36613, 33153, 45375
4
Notifications
Windsurf - Settings
Teams, Daily: 0% quota used · Weekly: 68% quota used
Shell Script
LF
UTF-8
Spaces: 2
Ln 1, Col 1
Screen Reader Optimized
Info: Setting up SSH Host (details): Creating local forwarding server...
Clear
Refactor Sync Script
15h
2h
1m
get familiar with the project. The idea is to copy the data from mac to nas (here). I am tryign to update the database using
@
screenpipe_fts_migrate.sh
and then run
@
screenpipe_sync.sh
for a day to move the data. Review and see is you understand everything.
Expand content to scroll
Thought
for 1s
I'll read the two scripts and explore the project structure to understand the data migration setup.
Thoughts
Read
screenpipe_fts_migrate.sh and screenpipe_sync.sh
Analyzed
/volume1/screenpipe
Thought
for 2s
I understand the setup. Here's my analysis:
Project Overview
Project Overview
You're migrating Screenpipe data from Mac to NAS. The workflow:
1.
screenpipe_fts_migrate.sh
- One-time database schema migration:
Adds
install_id
column to all synced tables (prevents ID collisions across reinstalls)
Creates
_installs
registry table
Rebuilds FTS (full-text search) tables with new schema
Optional but recommended
- running this on the NAS is 5-10x faster than letting
screenpipe_sync.sh
do it inline
Idempotent (safe to re-run)
2.
screenpipe_sync.sh
- Daily sync script:
Copies SQLite data from Mac (
~/.screenpipe/db.sqlite
) to NAS (
/Volumes/screenpipe/archive.db
)
Syncs vision tables (frames, elements, OCR, UI events) and audio tables (chunks, transcriptions, speakers)
Copies frame data files and audio MP4s via rsync
Uses
install_id
(UUID) to handle multiple screenpipe installations safely
Schema-drift tolerant (auto-adds new columns)
Run per date:
./screenpipe_sync.sh
(yesterday) or
./screenpipe_sync.sh 2026-05-12
Current NAS State
Current NAS State
archive.db
- 12.9 GB (main archive)...
|
31092
|
NULL
|
NULL
|
NULL
|
|
30041
|
1198
|
53
|
2026-05-13T07:23:45.354566+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-13/1778 /Users/lukas/.screenpipe/data/data/2026-05-13/1778657025354_m1.jpg...
|
Windsurf
|
screenpipe [SSH: nas] — screenpipe_sync_helpers.sh
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Explorer (⌥⌘E)
Search (⇧⌘F)
Source Control (⇧⌘ Explorer (⌥⌘E)
Search (⇧⌘F)
Source Control (⇧⌘G) - 15 pending changes
15
Codemaps
DeepWiki
Run and Debug
Remote Explorer
Extensions (⇧⌘X)
Explorer
Explorer
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
screenpipe [SSH: nas]
#recycle
app
data
lib
screenpipe_sync_helpers.sh
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
archive.db.bak-pre-installid
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_fts_migrate.sh
U
screenpipe_sync copy.sh
U
screenpipe_sync_old.sh
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
M
screenpipe.db
Outline Section
Outline
Outline
Timeline Section
Timeline
Timeline
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 41257, 36613, 33153
3
Notifications
Windsurf - Settings
Teams, Daily: 0% quota used · Weekly: 68% quota used
Shell Script
LF
UTF-8
Spaces: 2
Ln 42, Col 14
Screen Reader Optimized
expanded
Command Succeeded
Syncing Screenpipe Data
14h
40m
11m
get familiar with the project. The idea is to copy the data from mac to nas (here). I am tryign to update the database using
@
screenpipe_fts_migrate.sh
and then run
@
screenpipe_sync.sh
for a day to move the data. Review and see is you understand everything.
Expand content to scroll
Thought
for 1s
I'll read the two scripts and explore the project structure to understand the data migration setup.
Thoughts
Read
screenpipe_fts_migrate.sh and screenpipe_sync.sh
Analyzed
/volume1/screenpipe
Thought
for 2s
I understand the setup. Here's my analysis:
Project Overview
Project Overview
You're migrating Screenpipe data from Mac to NAS. The workflow:
1.
screenpipe_fts_migrate.sh
- One-time database schema migration:
Adds
install_id
column to all synced tables (prevents ID collisions across reinstalls)
Creates
_installs
registry table
Rebuilds FTS (full-text search) tables with new schema
Optional but recommended
- running this on the NAS is 5-10x faster than letting
screenpipe_sync.sh
do it inline
Idempotent (safe to re-run)
2.
screenpipe_sync.sh
- Daily sync script:
Copies SQLite data from Mac (
~/.screenpipe/db.sqlite
) to NAS (
/Volumes/screenpipe/archive.db
)
Syncs vision tables (frames, elements, OCR, UI events) and audio tables (chunks, transcriptions, speakers)
Copies frame data files and audio MP4s via rsync
Uses
install_id
(UUID) to handle multiple screenpipe installations safely
Schema-drift tolerant (auto-adds new columns)
Run per date:
./screenpipe_sync.sh
(yesterday) or
./screenpipe_sync.sh 2026-05-12
Current NAS State
Current NAS State
archive.db
- 12.9 GB (main archive)
archive.db-bak
- 11.1 GB (backup)
archive.db.bak-pre-installid...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Explorer (⌥⌘E)","depth":18,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true,"is_expanded":true},{"role":"AXStaticText","text":"","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Search (⇧⌘F)","depth":18,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Source Control (⇧⌘G) - 15 pending changes","depth":18,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"15","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Codemaps","depth":18,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"DeepWiki","depth":18,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXRadioButton","text":"Run and Debug","depth":18,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Remote 
Explorer","depth":18,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Extensions (⇧⌘X)","depth":18,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Explorer","depth":17,"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Explorer","depth":18,"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Explorer Section: screenpipe [SSH: nas]","depth":21,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":true},{"role":"AXStaticText","text":"","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Explorer Section: screenpipe [SSH: nas]","depth":22,"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"screenpipe [SSH: 
nas]","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"#recycle","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"app","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"data","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"lib","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync_helpers.sh","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"logs","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"pipes","dept
h":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":".gitignore","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"app_settings.json","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"archive.db","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"archive.db-bak","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"archive.db.bak-pre-installid","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"db.sqlite","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"db.sqlite-shm","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"db.sqlite-wal","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_fts_migrate.sh","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"on_
screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync copy.sh","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync_old.sh","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync_updated.sh","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync.sh","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"M","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe.db","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Outline Section","depth":21,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Outline","depth":22,"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Outline","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Timeline 
Section","depth":21,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Timeline","depth":22,"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Timeline","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"remote SSH: nas","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SSH: nas","depth":17,"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"screenpipe (Git) - master*, Checkout Branch/Tag...","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"master*","depth":17,"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"screenpipe (Git) - Synchronize Changes","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"No Problems","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"0","depth":17,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":17,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"0","depth":17,"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Forwarded Ports: 41257, 36613, 
33153","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"3","depth":17,"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Notifications","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Windsurf - Settings","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Teams, Daily: 0% quota used · Weekly: 68% quota used","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Shell Script","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"LF","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"UTF-8","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Spaces: 2","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Ln 42, Col 14","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Screen Reader 
Optimized","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"expanded","depth":12,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"Command Succeeded","depth":12,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"Syncing Screenpipe Data","depth":20,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"14h","depth":19,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"40m","depth":19,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"11m","depth":19,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"get familiar with the project. The idea is to copy the data from mac to nas (here). I am tryign to update the database using","depth":25,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"@","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_fts_migrate.sh","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"and then run","depth":25,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"@","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync.sh","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"for a day to move the data. 
Review and see is you understand everything.","depth":25,"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Expand content to scroll","depth":23,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Thought","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"for 1s","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"I'll read the two scripts and explore the project structure to understand the data migration setup.","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Thoughts","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Read","depth":20,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_fts_migrate.sh and screenpipe_sync.sh","depth":20,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Analyzed","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"/volume1/screenpipe","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Thought","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"for 2s","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"I understand the setup. Here's my analysis:","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Project Overview","depth":20,"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Project Overview","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"You're migrating Screenpipe data from Mac to NAS. 
The workflow:","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"1.","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_fts_migrate.sh","depth":24,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"- One-time database schema migration:","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Adds","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"install_id","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"column to all synced tables (prevents ID collisions across reinstalls)","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Creates","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"_installs","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"registry table","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Rebuilds FTS (full-text search) tables with new schema","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Optional but recommended","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"- running this on the NAS is 5-10x faster than letting","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync.sh","depth":24,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"do it inline","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Idempotent (safe to 
re-run)","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"2.","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync.sh","depth":24,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"- Daily sync script:","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Copies SQLite data from Mac (","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"~/.screenpipe/db.sqlite","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":") to NAS (","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"/Volumes/screenpipe/archive.db","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":")","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Syncs vision tables (frames, elements, OCR, UI events) and audio tables (chunks, transcriptions, speakers)","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Copies frame data files and audio MP4s via rsync","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Uses","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"install_id","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"(UUID) to handle multiple screenpipe installations safely","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Schema-drift tolerant (auto-adds new columns)","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Run per 
date:","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"./screenpipe_sync.sh","depth":24,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"(yesterday) or","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"./screenpipe_sync.sh 2026-05-12","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Current NAS State","depth":20,"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Current NAS State","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"archive.db","depth":24,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"- 12.9 GB (main archive)","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"archive.db-bak","depth":24,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"- 11.1 GB (backup)","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"archive.db.bak-pre-installid","depth":24,"on_screen":true,"role_description":"text"}]...
|
6682004471690324550
|
9186163396450705953
|
click
|
accessibility
|
NULL
|
Explorer (⌥⌘E)
Search (⇧⌘F)
Source Control (⇧⌘ Explorer (⌥⌘E)
Search (⇧⌘F)
Source Control (⇧⌘G) - 15 pending changes
15
Codemaps
DeepWiki
Run and Debug
Remote Explorer
Extensions (⇧⌘X)
Explorer
Explorer
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
screenpipe [SSH: nas]
#recycle
app
data
lib
screenpipe_sync_helpers.sh
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
archive.db.bak-pre-installid
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_fts_migrate.sh
U
screenpipe_sync copy.sh
U
screenpipe_sync_old.sh
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
M
screenpipe.db
Outline Section
Outline
Outline
Timeline Section
Timeline
Timeline
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 41257, 36613, 33153
3
Notifications
Windsurf - Settings
Teams, Daily: 0% quota used · Weekly: 68% quota used
Shell Script
LF
UTF-8
Spaces: 2
Ln 42, Col 14
Screen Reader Optimized
expanded
Command Succeeded
Syncing Screenpipe Data
14h
40m
11m
get familiar with the project. The idea is to copy the data from mac to nas (here). I am tryign to update the database using
@
screenpipe_fts_migrate.sh
and then run
@
screenpipe_sync.sh
for a day to move the data. Review and see is you understand everything.
Expand content to scroll
Thought
for 1s
I'll read the two scripts and explore the project structure to understand the data migration setup.
Thoughts
Read
screenpipe_fts_migrate.sh and screenpipe_sync.sh
Analyzed
/volume1/screenpipe
Thought
for 2s
I understand the setup. Here's my analysis:
Project Overview
Project Overview
You're migrating Screenpipe data from Mac to NAS. The workflow:
1.
screenpipe_fts_migrate.sh
- One-time database schema migration:
Adds
install_id
column to all synced tables (prevents ID collisions across reinstalls)
Creates
_installs
registry table
Rebuilds FTS (full-text search) tables with new schema
Optional but recommended
- running this on the NAS is 5-10x faster than letting
screenpipe_sync.sh
do it inline
Idempotent (safe to re-run)
2.
screenpipe_sync.sh
- Daily sync script:
Copies SQLite data from Mac (
~/.screenpipe/db.sqlite
) to NAS (
/Volumes/screenpipe/archive.db
)
Syncs vision tables (frames, elements, OCR, UI events) and audio tables (chunks, transcriptions, speakers)
Copies frame data files and audio MP4s via rsync
Uses
install_id
(UUID) to handle multiple screenpipe installations safely
Schema-drift tolerant (auto-adds new columns)
Run per date:
./screenpipe_sync.sh
(yesterday) or
./screenpipe_sync.sh 2026-05-12
Current NAS State
Current NAS State
archive.db
- 12.9 GB (main archive)
archive.db-bak
- 11.1 GB (backup)
archive.db.bak-pre-installid...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
29978
|
1199
|
8
|
2026-05-13T07:22:05.932580+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-13/1778 /Users/lukas/.screenpipe/data/data/2026-05-13/1778656925932_m2.jpg...
|
Windsurf
|
screenpipe [SSH: nas] — screenpipe_sync.sh — Modif screenpipe [SSH: nas] — screenpipe_sync.sh — Modified...
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Explorer (⌥⌘E)
Search (⇧⌘F)
Source Control (⇧⌘ Explorer (⌥⌘E)
Search (⇧⌘F)
Source Control (⇧⌘G) - 15 pending changes
15
Codemaps
DeepWiki
Run and Debug
Remote Explorer
Extensions (⇧⌘X)
Explorer
Explorer
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
screenpipe [SSH: nas]
#recycle
app
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
archive.db.bak-pre-installid
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_fts_migrate.sh
U
screenpipe_sync copy.sh
U
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
M
screenpipe.db
Outline Section
Outline
Outline
Timeline Section
Timeline
Timeline
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 41257, 36613, 33153
3
Notifications
Windsurf - Settings
Teams, Daily: 0% quota used · Weekly: 68% quota used
Shell Script
LF
UTF-8
Spaces: 2
Ln 35, Col 32
Screen Reader Optimized
git-commit Lukas Kovalik (2 weeks ago)
Lukas Kovalik (2 weeks ago)
Info: Setting up SSH Host (details): Creating local forwarding server...
Clear
Syncing Screenpipe Data
14h
34m
5m
get familiar with the project. The idea is to copy the data from mac to nas (here). I am tryign to update the database using
@
screenpipe_fts_migrate.sh
and then run
@
screenpipe_sync.sh
for a day to move the data. Review and see is you understand everything.
Expand content to scroll
Thought
for 1s
I'll read the two scripts and explore the project structure to understand the data migration setup.
Thoughts
Read
screenpipe_fts_migrate.sh and screenpipe_sync.sh
Analyzed
/volume1/screenpipe
Thought
for 2s
I understand the setup. Here's my analysis:
Project Overview
Project Overview
You're migrating Screenpipe data from Mac to NAS. The workflow:
1.
screenpipe_fts_migrate.sh
- One-time database schema migration:
Adds
install_id
column to all synced tables (prevents ID collisions across reinstalls)
Creates
_installs
registry table
Rebuilds FTS (full-text search) tables with new schema
Optional but recommended
- running this on the NAS is 5-10x faster than letting
screenpipe_sync.sh
do it inline
Idempotent (safe to re-run)
2.
screenpipe_sync.sh
- Daily sync script:
Copies SQLite data from Mac (
~/.screenpipe/db.sqlite
) to NAS (
/Volumes/screenpipe/archive.db
)
Syncs vision tables (frames, elements, OCR, UI events) and audio tables (chunks, transcriptions, speakers)
Copies frame data files and audio MP4s via rsync
Uses
install_id
(UUID) to handle multiple screenpipe installations safely
Schema-drift tolerant (auto-adds new columns)
Run per date:
./screenpipe_sync.sh
(yesterday) or
./screenpipe_sync.sh 2026-05-12
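The migration step described above (add an install_id column to each synced table, create an _installs registry, safe to re-run) can be sketched roughly as follows. This is only an illustration in Python's sqlite3 module, not the contents of screenpipe_fts_migrate.sh: the function names, the _installs column layout, and the table list passed in are assumptions made for the example.

```python
import sqlite3


def ensure_install_id(conn, tables):
    """Add an install_id column to each table if it is missing.

    Hypothetical helper: the real script's schema is not shown in the source.
    Safe to call repeatedly, because every step checks before it changes anything.
    """
    # Registry of known installations (column layout is an assumption).
    conn.execute(
        "CREATE TABLE IF NOT EXISTS _installs ("
        "install_id TEXT PRIMARY KEY, "
        "first_seen TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    for table in tables:
        # PRAGMA table_info returns one row per column; index 1 is the column name.
        cols = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        if "install_id" not in cols:
            conn.execute(f'ALTER TABLE "{table}" ADD COLUMN install_id TEXT')
    conn.commit()


def register_install(conn, install_id):
    """Record an installation UUID once; repeated calls are ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO _installs (install_id) VALUES (?)", (install_id,)
    )
    conn.commit()
```

Checking PRAGMA table_info before ALTER TABLE is what makes a second run a no-op, which is one plausible way to get the "idempotent" property mentioned in the summary.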
Current NAS State
Current NAS State
archive.db
- 12.9 GB (main archive)
archive.db-bak
- 11.1 GB (backup)...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Explorer (⌥⌘E)","depth":18,"bounds":{"left":0.0,"top":0.047885075,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true,"is_expanded":true},{"role":"AXStaticText","text":"","depth":21,"bounds":{"left":0.0033244682,"top":0.05586592,"width":0.004986702,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Search (⇧⌘F)","depth":18,"bounds":{"left":0.0,"top":0.07581804,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"bounds":{"left":0.0033244682,"top":0.083798885,"width":0.004986702,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Source Control (⇧⌘G) - 15 pending 
changes","depth":18,"bounds":{"left":0.0,"top":0.103751,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"bounds":{"left":0.0033244682,"top":0.11173184,"width":0.004986702,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"15","depth":21,"bounds":{"left":0.005319149,"top":0.11811652,"width":0.0033244682,"height":0.007980846},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.0056515955,"top":0.118914604,"width":0.0013297872,"height":0.0071827616}},{"char_start":1,"char_count":1,"bounds":{"left":0.006981383,"top":0.118914604,"width":0.0016622341,"height":0.0071827616}}],"role_description":"text"},{"role":"AXRadioButton","text":"Codemaps","depth":18,"bounds":{"left":0.0,"top":0.13168396,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"bounds":{"left":0.0033244682,"top":0.1396648,"width":0.004986702,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"DeepWiki","depth":18,"bounds":{"left":0.0,"top":0.15961692,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXRadioButton","text":"Run and 
Debug","depth":18,"bounds":{"left":0.0,"top":0.18754987,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"bounds":{"left":0.0033244682,"top":0.19553073,"width":0.004986702,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Remote Explorer","depth":18,"bounds":{"left":0.0,"top":0.21548285,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"bounds":{"left":0.0033244682,"top":0.22346368,"width":0.004986702,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Extensions (⇧⌘X)","depth":18,"bounds":{"left":0.0,"top":0.2434158,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"bounds":{"left":0.0033244682,"top":0.25139666,"width":0.004986702,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Explorer","depth":17,"bounds":{"left":0.01462766,"top":0.047885075,"width":0.013630319,"height":0.023144454},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Explorer","depth":18,"bounds":{"left":0.01462766,"top":0.054269753,"width":0.013630319,"height":0.009577015},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.014960106,"top":0.055067837,"width":0.0019946808,"height":0.008778931}},{"char_start":1,"char_count":7,"bounds":{"left":0.016954787,"top":0.055067837,"width":0.011303191,"height":0.008778931}}],"role_description":"text"},{"role":"AXButton","text":"Explorer 
Section: screenpipe [SSH: nas]","depth":21,"bounds":{"left":0.011635638,"top":0.07102953,"width":0.0831117,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":true},{"role":"AXStaticText","text":"","depth":23,"bounds":{"left":0.011968086,"top":0.0726257,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Explorer Section: screenpipe [SSH: nas]","depth":22,"bounds":{"left":0.016954787,"top":0.07102953,"width":0.035904255,"height":0.014365523},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"screenpipe [SSH: nas]","depth":23,"bounds":{"left":0.016954787,"top":0.07342378,"width":0.035904255,"height":0.009577015},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.017287234,"top":0.07342378,"width":0.0019946808,"height":0.009577015}},{"char_start":1,"char_count":20,"bounds":{"left":0.018949468,"top":0.07342378,"width":0.034242023,"height":0.009577015}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.014295213,"top":0.08699122,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.08699122,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"#recycle","depth":27,"bounds":{"left":0.025930852,"top":0.08699122,"width":0.01462766,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.087789305,"width":0.0026595744,"height":0.0103751}},{"char_start":1,"char_count":7,"bounds":{"left":0.02825798,"top":0.087789305,"width":0.012300532,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.014295213,"top":0.10215483,"width":0.004654255,"height":0.01117
3184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.10215483,"width":0.0039893617,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"app","depth":27,"bounds":{"left":0.025930852,"top":0.10215483,"width":0.0063164895,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.10215483,"width":0.0023271276,"height":0.011173184}},{"char_start":1,"char_count":2,"bounds":{"left":0.027925532,"top":0.10215483,"width":0.004654255,"height":0.011173184}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.08676862,"top":0.10215483,"width":0.0039893617,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.014295213,"top":0.11652035,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.11652035,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"data","depth":27,"bounds":{"left":0.025930852,"top":0.11652035,"width":0.0076462766,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.11731844,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":3,"bounds":{"left":0.02825798,"top":0.11731844,"width":0.005319149,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.014295213,"top":0.13088587,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.13088587,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"logs",
"depth":27,"bounds":{"left":0.025930852,"top":0.13088587,"width":0.006981383,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.13168396,"width":0.0009973404,"height":0.0103751}},{"char_start":1,"char_count":3,"bounds":{"left":0.026928192,"top":0.13168396,"width":0.0063164895,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.08676862,"top":0.13168396,"width":0.0039893617,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.014295213,"top":0.14604948,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.14604948,"width":0.0039893617,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"pipes","depth":27,"bounds":{"left":0.025930852,"top":0.14604948,"width":0.00930851,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.14604948,"width":0.0023271276,"height":0.011173184}},{"char_start":1,"char_count":4,"bounds":{"left":0.02825798,"top":0.14604948,"width":0.006981383,"height":0.011173184}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.16041501,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":".gitignore","depth":27,"bounds":{"left":0.025930852,"top":0.16041501,"width":0.015957447,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.16121309,"width":0.0013297872,"height":0.0103751}},{"char_start":1,"char_count":9,"bounds":{"left":0.026928192,"top":0.16121309,"width":0.014960106,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,
"bounds":{"left":0.019614361,"top":0.17478053,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"app_settings.json","depth":27,"bounds":{"left":0.025930852,"top":0.17478053,"width":0.029920213,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.17557861,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":16,"bounds":{"left":0.027925532,"top":0.17557861,"width":0.027925532,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.18994413,"width":0.0039893617,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"archive.db","depth":27,"bounds":{"left":0.025930852,"top":0.18994413,"width":0.01761968,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.18994413,"width":0.0023271276,"height":0.011173184}},{"char_start":1,"char_count":9,"bounds":{"left":0.027925532,"top":0.18994413,"width":0.015625,"height":0.011173184}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.20430966,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"archive.db-bak","depth":27,"bounds":{"left":0.025930852,"top":0.20430966,"width":0.025265958,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.20510775,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":13,"bounds":{"left":0.027925532,"top":0.20510775,"width":0.023603724,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"bounds":{"left":0.087765954,"top":0.20510775,"width":0.0026595744,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","tex
t":"","depth":27,"bounds":{"left":0.019614361,"top":0.21867518,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"archive.db.bak-pre-installid","depth":27,"bounds":{"left":0.025930852,"top":0.21867518,"width":0.046210106,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.21947326,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":27,"bounds":{"left":0.027925532,"top":0.21947326,"width":0.04454787,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"bounds":{"left":0.087765954,"top":0.21947326,"width":0.0026595744,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.23383878,"width":0.0039893617,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"db.sqlite","depth":27,"bounds":{"left":0.025930852,"top":0.23383878,"width":0.01462766,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.23383878,"width":0.0023271276,"height":0.011173184}},{"char_start":1,"char_count":8,"bounds":{"left":0.02825798,"top":0.23383878,"width":0.012300532,"height":0.011173184}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.2482043,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"db.sqlite-shm","depth":27,"bounds":{"left":0.025930852,"top":0.2482043,"width":0.023271276,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.2490024,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":12,"bounds":{"left":0.02825798,"top":0.2490024,"width":0.021276595,"height":0.0103751}}],"role_description":"text"},{"rol
e":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.26256984,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"db.sqlite-wal","depth":27,"bounds":{"left":0.025930852,"top":0.26256984,"width":0.021941489,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.26336792,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":12,"bounds":{"left":0.02825798,"top":0.26336792,"width":0.019614361,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.27773345,"width":0.0039893617,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_fts_migrate.sh","depth":27,"bounds":{"left":0.025930852,"top":0.27773345,"width":0.04488032,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.27773345,"width":0.0019946808,"height":0.011173184}},{"char_start":1,"char_count":24,"bounds":{"left":0.027925532,"top":0.27773345,"width":0.04288564,"height":0.011173184}}],"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"bounds":{"left":0.087765954,"top":0.27773345,"width":0.0026595744,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.29209897,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync 
copy.sh","depth":27,"bounds":{"left":0.025930852,"top":0.29209897,"width":0.042220745,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.29289705,"width":0.0019946808,"height":0.0103751}},{"char_start":1,"char_count":22,"bounds":{"left":0.027925532,"top":0.29289705,"width":0.04055851,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"bounds":{"left":0.087765954,"top":0.29289705,"width":0.0026595744,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.3064645,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync_updated.sh","depth":27,"bounds":{"left":0.025930852,"top":0.3064645,"width":0.04920213,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.30726257,"width":0.0019946808,"height":0.0103751}},{"char_start":1,"char_count":25,"bounds":{"left":0.027925532,"top":0.30726257,"width":0.047539894,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"bounds":{"left":0.087765954,"top":0.30726257,"width":0.0026595744,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.3216281,"width":0.0039893617,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync.sh","depth":27,"bounds":{"left":0.025930852,"top":0.3216281,"width":0.03357713,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.3216281,"width":0.0019946808,"height":0.011173184}},{"char_start":1,"char_count":17,"bounds":{"left":0.027925532,"top":0.3216281,"width":0.03158245,"height":0.011173184}}],"role_description":"text"},{"role":
"AXStaticText","text":"M","depth":27,"bounds":{"left":0.087101065,"top":0.3216281,"width":0.0033244682,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.33599362,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe.db","depth":27,"bounds":{"left":0.025930852,"top":0.33599362,"width":0.023936171,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.3367917,"width":0.0019946808,"height":0.0103751}},{"char_start":1,"char_count":12,"bounds":{"left":0.027925532,"top":0.3367917,"width":0.022273935,"height":0.0103751}}],"role_description":"text"},{"role":"AXButton","text":"Outline Section","depth":21,"bounds":{"left":0.011635638,"top":0.95530725,"width":0.0831117,"height":0.015163607},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":23,"bounds":{"left":0.011968086,"top":0.95690346,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Outline","depth":22,"bounds":{"left":0.016954787,"top":0.95530725,"width":0.011635638,"height":0.015163607},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Outline","depth":23,"bounds":{"left":0.016954787,"top":0.9577015,"width":0.011635638,"height":0.009577015},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.017287234,"top":0.9584996,"width":0.0026595744,"height":0.009577015}},{"char_start":1,"char_count":6,"bounds":{"left":0.019946808,"top":0.9584996,"width":0.008976064,"height":0.009577015}}],"role_description":"text"},{"role":"AXButton","text":"Timeline 
Section","depth":21,"bounds":{"left":0.011635638,"top":0.9696728,"width":0.0831117,"height":0.015163607},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":23,"bounds":{"left":0.011968086,"top":0.97206706,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Timeline","depth":22,"bounds":{"left":0.016954787,"top":0.97047085,"width":0.013630319,"height":0.014365523},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Timeline","depth":23,"bounds":{"left":0.016954787,"top":0.9728651,"width":0.013630319,"height":0.009577015},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.017287234,"top":0.9728651,"width":0.0023271276,"height":0.009577015}},{"char_start":1,"char_count":7,"bounds":{"left":0.019281914,"top":0.9728651,"width":0.011635638,"height":0.009577015}}],"role_description":"text"},{"role":"AXButton","text":"remote SSH: nas","depth":16,"bounds":{"left":0.0016622341,"top":0.9848364,"width":0.024268618,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"bounds":{"left":0.0039893617,"top":0.98643255,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SSH: nas","depth":17,"bounds":{"left":0.00831117,"top":0.98723066,"width":0.015292553,"height":0.009577015},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.008643617,"top":0.98723066,"width":0.0009973404,"height":0.009577015}},{"char_start":1,"char_count":7,"bounds":{"left":0.009640957,"top":0.98723066,"width":0.012300532,"height":0.009577015}}],"role_description":"text"},{"role":"AXButton","text":"screenpipe (Git) - master*, Checkout 
Branch/Tag...","depth":16,"bounds":{"left":0.027260639,"top":0.9848364,"width":0.019281914,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"bounds":{"left":0.027925532,"top":0.98643255,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"master*","depth":17,"bounds":{"left":0.032247342,"top":0.98723066,"width":0.013630319,"height":0.009577015},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.032579787,"top":0.98723066,"width":0.0009973404,"height":0.009577015}},{"char_start":1,"char_count":6,"bounds":{"left":0.03357713,"top":0.98723066,"width":0.010970744,"height":0.009577015}}],"role_description":"text"},{"role":"AXButton","text":"screenpipe (Git) - Synchronize Changes","depth":16,"bounds":{"left":0.04654255,"top":0.9848364,"width":0.0063164895,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"No 
Problems","depth":16,"bounds":{"left":0.054853722,"top":0.9848364,"width":0.01861702,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"bounds":{"left":0.05618351,"top":0.98643255,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"0","depth":17,"bounds":{"left":0.06050532,"top":0.98723066,"width":0.0043218085,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":17,"bounds":{"left":0.064494684,"top":0.98643255,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"0","depth":17,"bounds":{"left":0.069148935,"top":0.98723066,"width":0.0029920214,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Forwarded Ports: 41257, 36613, 33153","depth":16,"bounds":{"left":0.07513298,"top":0.9848364,"width":0.010305851,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"bounds":{"left":0.07646277,"top":0.98643255,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"3","depth":17,"bounds":{"left":0.080784574,"top":0.98723066,"width":0.0033244682,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Notifications","depth":16,"bounds":{"left":0.99102396,"top":0.9848364,"width":0.008976042,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Windsurf - 
Settings","depth":16,"bounds":{"left":0.9567819,"top":0.9848364,"width":0.03357713,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Teams, Daily: 0% quota used · Weekly: 68% quota used","depth":16,"bounds":{"left":0.9421542,"top":0.9848364,"width":0.012965426,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Shell Script","depth":16,"bounds":{"left":0.91988033,"top":0.9848364,"width":0.020611702,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"LF","depth":16,"bounds":{"left":0.9115692,"top":0.9848364,"width":0.0066489363,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"UTF-8","depth":16,"bounds":{"left":0.8969415,"top":0.9848364,"width":0.013297873,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Spaces: 2","depth":16,"bounds":{"left":0.87699467,"top":0.9848364,"width":0.01861702,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Ln 35, Col 32","depth":16,"bounds":{"left":0.85139626,"top":0.9848364,"width":0.024268618,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Screen Reader 
Optimized","depth":16,"bounds":{"left":0.8061835,"top":0.9848364,"width":0.04454787,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"git-commit Lukas Kovalik (2 weeks ago)","depth":16,"bounds":{"left":0.75332445,"top":0.9848364,"width":0.05219415,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"bounds":{"left":0.7546542,"top":0.98643255,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Lukas Kovalik (2 weeks ago)","depth":17,"bounds":{"left":0.75897604,"top":0.98723066,"width":0.045212764,"height":0.009577015},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.75897604,"top":0.98723066,"width":0.0013297872,"height":0.009577015}},{"char_start":1,"char_count":26,"bounds":{"left":0.7599734,"top":0.98723066,"width":0.043218084,"height":0.009577015}}],"role_description":"text"},{"role":"AXStaticText","text":"Info: Setting up SSH Host (details): Creating local forwarding server...","depth":12,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"Clear","depth":12,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"Syncing Screenpipe 
Data","depth":20,"bounds":{"left":0.76163566,"top":0.05347167,"width":0.042220745,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.76163566,"top":0.054269753,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":22,"bounds":{"left":0.76396275,"top":0.054269753,"width":0.039893616,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"14h","depth":19,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"34m","depth":19,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"5m","depth":19,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"get familiar with the project. The idea is to copy the data from mac to nas (here). I am tryign to update the database using","depth":25,"bounds":{"left":0.7659575,"top":0.07102953,"width":0.2044548,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"@","depth":27,"bounds":{"left":0.76662236,"top":0.07102953,"width":0.0033244682,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_fts_migrate.sh","depth":27,"bounds":{"left":0.7699468,"top":0.07102953,"width":0.04288564,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"and then run","depth":25,"bounds":{"left":0.81349736,"top":0.07102953,"width":0.023603724,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"@","depth":27,"bounds":{"left":0.83776593,"top":0.07102953,"width":0.0029920214,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync.sh","depth":27,"bounds":{"left":0.84075797,"top":0.07102953,"width":0.032247342,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"for a day to move the data. 
Review and see is you understand everything.","depth":25,"bounds":{"left":0.7659575,"top":0.07102953,"width":0.21243352,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Expand content to scroll","depth":23,"bounds":{"left":0.87333775,"top":0.07102953,"width":0.012300532,"height":0.0007980846},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Thought","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.014295213,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"for 1s","depth":21,"bounds":{"left":0.77859044,"top":0.07102953,"width":0.00930851,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"I'll read the two scripts and explore the project structure to understand the data migration setup.","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.17220744,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Thoughts","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.015957447,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Read","depth":20,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.008643617,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_fts_migrate.sh and 
screenpipe_sync.sh","depth":20,"bounds":{"left":0.77293885,"top":0.07102953,"width":0.08643617,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Analyzed","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.015625,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":23,"bounds":{"left":0.77958775,"top":0.07102953,"width":0.0023271276,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"/volume1/screenpipe","depth":22,"bounds":{"left":0.78523934,"top":0.07102953,"width":0.038231384,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Thought","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.014295213,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"for 2s","depth":21,"bounds":{"left":0.77859044,"top":0.07102953,"width":0.009640957,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"I understand the setup. Here's my analysis:","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.07712766,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Project Overview","depth":20,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.23238032,"height":0.0007980846},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Project Overview","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.033909574,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"You're migrating Screenpipe data from Mac to NAS. 
The workflow:","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.118351065,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"1.","depth":22,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.0039893617,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"bounds":{"left":0.76728725,"top":0.07102953,"width":0.0023271276,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_fts_migrate.sh","depth":24,"bounds":{"left":0.77293885,"top":0.07102953,"width":0.053856384,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"- One-time database schema migration:","depth":21,"bounds":{"left":0.8267952,"top":0.07102953,"width":0.07247341,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Adds","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.010305851,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"install_id","depth":23,"bounds":{"left":0.78224736,"top":0.07102953,"width":0.01861702,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"column to all synced tables (prevents ID collisions across reinstalls)","depth":22,"bounds":{"left":0.8018617,"top":0.07102953,"width":0.12134308,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Creates","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.015292553,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"_installs","depth":23,"bounds":{"left":0.7869016,"top":0.07102953,"width":0.016954787,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"registry 
table","depth":22,"bounds":{"left":0.80452126,"top":0.07102953,"width":0.024933511,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Rebuilds FTS (full-text search) tables with new schema","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.09840426,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Optional but recommended","depth":23,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.05119681,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"- running this on the NAS is 5-10x faster than letting","depth":22,"bounds":{"left":0.82214093,"top":0.07102953,"width":0.09541223,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"bounds":{"left":0.9172208,"top":0.07102953,"width":0.0023271276,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync.sh","depth":24,"bounds":{"left":0.92287236,"top":0.07102953,"width":0.038896278,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"do it inline","depth":22,"bounds":{"left":0.9617686,"top":0.07102953,"width":0.019614361,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Idempotent (safe to 
re-run)","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.04920213,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"2.","depth":22,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.004654255,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"bounds":{"left":0.76795214,"top":0.07102953,"width":0.0023271276,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync.sh","depth":24,"bounds":{"left":0.77327126,"top":0.07102953,"width":0.039228722,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"- Daily sync script:","depth":21,"bounds":{"left":0.8121675,"top":0.07102953,"width":0.034574468,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Copies SQLite data from Mac (","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.05518617,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"~/.screenpipe/db.sqlite","depth":23,"bounds":{"left":0.8267952,"top":0.07102953,"width":0.04255319,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":") to NAS (","depth":22,"bounds":{"left":0.8703458,"top":0.07102953,"width":0.01761968,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"/Volumes/screenpipe/archive.db","depth":23,"bounds":{"left":0.88896275,"top":0.07102953,"width":0.05518617,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":")","depth":22,"bounds":{"left":0.94514626,"top":0.07102953,"width":0.0013297872,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Syncs vision tables (frames, elements, OCR, UI events) and audio tables (chunks, transcriptions, 
speakers)","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.19082446,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Copies frame data files and audio MP4s via rsync","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.08809841,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Uses","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.010305851,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"install_id","depth":23,"bounds":{"left":0.78224736,"top":0.07102953,"width":0.018284574,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"(UUID) to handle multiple screenpipe installations safely","depth":22,"bounds":{"left":0.8015292,"top":0.07102953,"width":0.10139628,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Schema-drift tolerant (auto-adds new columns)","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.08577128,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Run per date:","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.025265958,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"bounds":{"left":0.7962101,"top":0.07102953,"width":0.0023271276,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"./screenpipe_sync.sh","depth":24,"bounds":{"left":0.8015292,"top":0.07102953,"width":0.043550532,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"(yesterday) or","depth":22,"bounds":{"left":0.84474736,"top":0.07102953,"width":0.027260639,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"./screenpipe_sync.sh 
2026-05-12","depth":23,"bounds":{"left":0.87300533,"top":0.07102953,"width":0.05718085,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Current NAS State","depth":20,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.23238032,"height":0.0007980846},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Current NAS State","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.03656915,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.0023271276,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"archive.db","depth":24,"bounds":{"left":0.7765958,"top":0.07102953,"width":0.021609042,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"- 12.9 GB (main archive)","depth":22,"bounds":{"left":0.7982048,"top":0.07102953,"width":0.044215426,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.0023271276,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"archive.db-bak","depth":24,"bounds":{"left":0.7765958,"top":0.07102953,"width":0.03025266,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"- 11.1 GB (backup)","depth":22,"bounds":{"left":0.8068484,"top":0.07102953,"width":0.03357713,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.0023271276,"height":0.0007980846},"on_screen":true,"role_description":"text"}]...
|
3235243673454093362
|
9186119455646917349
|
click
|
accessibility
|
NULL
|
Explorer (⌥⌘E)
Search (⇧⌘F)
Source Control (⇧⌘ Explorer (⌥⌘E)
Search (⇧⌘F)
Source Control (⇧⌘G) - 15 pending changes
15
Codemaps
DeepWiki
Run and Debug
Remote Explorer
Extensions (⇧⌘X)
Explorer
Explorer
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
screenpipe [SSH: nas]
#recycle
app
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
archive.db.bak-pre-installid
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_fts_migrate.sh
U
screenpipe_sync copy.sh
U
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
M
screenpipe.db
Outline Section
Outline
Outline
Timeline Section
Timeline
Timeline
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 41257, 36613, 33153
3
Notifications
Windsurf - Settings
Teams, Daily: 0% quota used · Weekly: 68% quota used
Shell Script
LF
UTF-8
Spaces: 2
Ln 35, Col 32
Screen Reader Optimized
git-commit Lukas Kovalik (2 weeks ago)
Lukas Kovalik (2 weeks ago)
Info: Setting up SSH Host (details): Creating local forwarding server...
Clear
Syncing Screenpipe Data
14h
34m
5m
get familiar with the project. The idea is to copy the data from mac to nas (here). I am tryign to update the database using
@
screenpipe_fts_migrate.sh
and then run
@
screenpipe_sync.sh
for a day to move the data. Review and see is you understand everything.
Expand content to scroll
Thought
for 1s
I'll read the two scripts and explore the project structure to understand the data migration setup.
Thoughts
Read
screenpipe_fts_migrate.sh and screenpipe_sync.sh
Analyzed
/volume1/screenpipe
Thought
for 2s
I understand the setup. Here's my analysis:
Project Overview
Project Overview
You're migrating Screenpipe data from Mac to NAS. The workflow:
1.
screenpipe_fts_migrate.sh
- One-time database schema migration:
Adds
install_id
column to all synced tables (prevents ID collisions across reinstalls)
Creates
_installs
registry table
Rebuilds FTS (full-text search) tables with new schema
Optional but recommended
- running this on the NAS is 5-10x faster than letting
screenpipe_sync.sh
do it inline
Idempotent (safe to re-run)
2.
screenpipe_sync.sh
- Daily sync script:
Copies SQLite data from Mac (
~/.screenpipe/db.sqlite
) to NAS (
/Volumes/screenpipe/archive.db
)
Syncs vision tables (frames, elements, OCR, UI events) and audio tables (chunks, transcriptions, speakers)
Copies frame data files and audio MP4s via rsync
Uses
install_id
(UUID) to handle multiple screenpipe installations safely
Schema-drift tolerant (auto-adds new columns)
Run per date:
./screenpipe_sync.sh
(yesterday) or
./screenpipe_sync.sh 2026-05-12
Current NAS State
Current NAS State
archive.db
- 12.9 GB (main archive)
archive.db-bak
- 11.1 GB (backup)...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
30016
|
1198
|
39
|
2026-05-13T07:23:03.317656+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-13/1778 /Users/lukas/.screenpipe/data/data/2026-05-13/1778656983317_m1.jpg...
|
Windsurf
|
screenpipe [SSH: nas] — screenpipe_sync.sh — Modif screenpipe [SSH: nas] — screenpipe_sync.sh — Modified...
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Explorer (⌥⌘E)
Search (⇧⌘F)
Source Control (⇧⌘ Explorer (⌥⌘E)
Search (⇧⌘F)
Source Control (⇧⌘G) - 15 pending changes
15
Codemaps
DeepWiki
Run and Debug
Remote Explorer
Extensions (⇧⌘X)
Explorer
Explorer
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
screenpipe [SSH: nas]
#recycle
app
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
archive.db.bak-pre-installid
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_fts_migrate.sh
U
screenpipe_sync copy.sh
U
screenpipe_sync_old.sh
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
M
screenpipe.db
Outline Section
Outline
Outline
Timeline Section
Timeline
Timeline
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 41257, 36613, 33153
3
Notifications
Windsurf - Settings
Teams, Daily: 0% quota used · Weekly: 68% quota used
Shell Script
LF
UTF-8
Spaces: 2
Ln 397, Col 21
Screen Reader Optimized
git-commit Not Committed Yet
Not Committed Yet
| tee -a "$LOG_FILE", Inspect this in the accessible view (Option+F2)
Clear
Syncing Screenpipe Data
14h
34m
5m
get familiar with the project. The idea is to copy the data from mac to nas (here). I am tryign to update the database using
@
screenpipe_fts_migrate.sh
and then run
@
screenpipe_sync.sh
for a day to move the data. Review and see is you understand everything.
Expand content to scroll
Thought
for 1s
I'll read the two scripts and explore the project structure to understand the data migration setup.
Thoughts
Read
screenpipe_fts_migrate.sh and screenpipe_sync.sh
Analyzed
/volume1/screenpipe
Thought
for 2s
I understand the setup. Here's my analysis:
Project Overview
Project Overview
You're migrating Screenpipe data from Mac to NAS. The workflow:
1.
screenpipe_fts_migrate.sh
- One-time database schema migration:
Adds
install_id
column to all synced tables (prevents ID collisions across reinstalls)
Creates
_installs
registry table
Rebuilds FTS (full-text search) tables with new schema
Optional but recommended
- running this on the NAS is 5-10x faster than letting
screenpipe_sync.sh
do it inline
Idempotent (safe to re-run)
2.
screenpipe_sync.sh
- Daily sync script:
Copies SQLite data from Mac (
~/.screenpipe/db.sqlite
) to NAS (
/Volumes/screenpipe/archive.db
)
Syncs vision tables (frames, elements, OCR, UI events) and audio tables (chunks, transcriptions, speakers)
Copies frame data files and audio MP4s via rsync
Uses
install_id
(UUID) to handle multiple screenpipe installations safely
Schema-drift tolerant (auto-adds new columns)
Run per date:
./screenpipe_sync.sh
(yesterday) or
./screenpipe_sync.sh 2026-05-12
Current NAS State
Current NAS State...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Explorer (⌥⌘E)","depth":18,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true,"is_expanded":true},{"role":"AXStaticText","text":"","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Search (⇧⌘F)","depth":18,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Source Control (⇧⌘G) - 15 pending changes","depth":18,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"15","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Codemaps","depth":18,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"DeepWiki","depth":18,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXRadioButton","text":"Run and Debug","depth":18,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Remote 
Explorer","depth":18,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Extensions (⇧⌘X)","depth":18,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Explorer","depth":17,"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Explorer","depth":18,"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Explorer Section: screenpipe [SSH: nas]","depth":21,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":true},{"role":"AXStaticText","text":"","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Explorer Section: screenpipe [SSH: nas]","depth":22,"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"screenpipe [SSH: 
nas]","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"#recycle","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"app","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"data","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"logs","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"pipes","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":".gitignore","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"app_settings.json","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth"
:27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"archive.db","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"archive.db-bak","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"archive.db.bak-pre-installid","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"db.sqlite","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"db.sqlite-shm","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"db.sqlite-wal","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_fts_migrate.sh","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync 
copy.sh","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync_old.sh","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync_updated.sh","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync.sh","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"M","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe.db","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Outline Section","depth":21,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Outline","depth":22,"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Outline","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Timeline 
Section","depth":21,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Timeline","depth":22,"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Timeline","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"remote SSH: nas","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SSH: nas","depth":17,"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"screenpipe (Git) - master*, Checkout Branch/Tag...","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"master*","depth":17,"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"screenpipe (Git) - Synchronize Changes","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"No Problems","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"0","depth":17,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":17,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"0","depth":17,"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Forwarded Ports: 41257, 36613, 
33153","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"3","depth":17,"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Notifications","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Windsurf - Settings","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Teams, Daily: 0% quota used · Weekly: 68% quota used","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Shell Script","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"LF","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"UTF-8","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Spaces: 2","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Ln 397, Col 21","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Screen Reader Optimized","depth":16,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"git-commit Not Committed 
Yet","depth":16,"bounds":{"left":1.0,"top":0.0,"width":-0.03472221,"height":0.02},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"bounds":{"left":1.0,"top":0.0,"width":-0.037500024,"height":0.015555556},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Not Committed Yet","depth":17,"bounds":{"left":1.0,"top":0.0,"width":-0.046527743,"height":0.013333334},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"| tee -a \"$LOG_FILE\", Inspect this in the accessible view (Option+F2)","depth":12,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"Clear","depth":12,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"Syncing Screenpipe Data","depth":20,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"14h","depth":19,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"34m","depth":19,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"5m","depth":19,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"get familiar with the project. The idea is to copy the data from mac to nas (here). I am tryign to update the database using","depth":25,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"@","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_fts_migrate.sh","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"and then run","depth":25,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"@","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync.sh","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"for a day to move the data. 
Review and see is you understand everything.","depth":25,"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Expand content to scroll","depth":23,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Thought","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"for 1s","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"I'll read the two scripts and explore the project structure to understand the data migration setup.","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Thoughts","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Read","depth":20,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_fts_migrate.sh and screenpipe_sync.sh","depth":20,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Analyzed","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"/volume1/screenpipe","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Thought","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"for 2s","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"I understand the setup. Here's my analysis:","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Project Overview","depth":20,"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Project Overview","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"You're migrating Screenpipe data from Mac to NAS. 
The workflow:","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"1.","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_fts_migrate.sh","depth":24,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"- One-time database schema migration:","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Adds","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"install_id","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"column to all synced tables (prevents ID collisions across reinstalls)","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Creates","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"_installs","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"registry table","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Rebuilds FTS (full-text search) tables with new schema","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Optional but recommended","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"- running this on the NAS is 5-10x faster than letting","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync.sh","depth":24,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"do it inline","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Idempotent (safe to 
re-run)","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"2.","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync.sh","depth":24,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"- Daily sync script:","depth":21,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Copies SQLite data from Mac (","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"~/.screenpipe/db.sqlite","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":") to NAS (","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"/Volumes/screenpipe/archive.db","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":")","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Syncs vision tables (frames, elements, OCR, UI events) and audio tables (chunks, transcriptions, speakers)","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Copies frame data files and audio MP4s via rsync","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Uses","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"install_id","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"(UUID) to handle multiple screenpipe installations safely","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Schema-drift tolerant (auto-adds new columns)","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Run per 
date:","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"./screenpipe_sync.sh","depth":24,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"(yesterday) or","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"./screenpipe_sync.sh 2026-05-12","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Current NAS State","depth":20,"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Current NAS State","depth":21,"on_screen":true,"role_description":"text"}]...
|
-1379333363540575427
|
9186093030927737442
|
click
|
accessibility
|
NULL
|
Explorer (⌥⌘E)
Search (⇧⌘F)
Source Control (⇧⌘G) - 15 pending changes
15
Codemaps
DeepWiki
Run and Debug
Remote Explorer
Extensions (⇧⌘X)
Explorer
Explorer
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
screenpipe [SSH: nas]
#recycle
app
data
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
archive.db.bak-pre-installid
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_fts_migrate.sh
U
screenpipe_sync copy.sh
U
screenpipe_sync_old.sh
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
M
screenpipe.db
Outline Section
Outline
Outline
Timeline Section
Timeline
Timeline
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 41257, 36613, 33153
3
Notifications
Windsurf - Settings
Teams, Daily: 0% quota used · Weekly: 68% quota used
Shell Script
LF
UTF-8
Spaces: 2
Ln 397, Col 21
Screen Reader Optimized
git-commit Not Committed Yet
Not Committed Yet
| tee -a "$LOG_FILE", Inspect this in the accessible view (Option+F2)
Clear
Syncing Screenpipe Data
14h
34m
5m
get familiar with the project. The idea is to copy the data from mac to nas (here). I am trying to update the database using
@
screenpipe_fts_migrate.sh
and then run
@
screenpipe_sync.sh
for a day to move the data. Review and see if you understand everything.
Expand content to scroll
Thought
for 1s
I'll read the two scripts and explore the project structure to understand the data migration setup.
Thoughts
Read
screenpipe_fts_migrate.sh and screenpipe_sync.sh
Analyzed
/volume1/screenpipe
Thought
for 2s
I understand the setup. Here's my analysis:
Project Overview
Project Overview
You're migrating Screenpipe data from Mac to NAS. The workflow:
1.
screenpipe_fts_migrate.sh
- One-time database schema migration:
Adds
install_id
column to all synced tables (prevents ID collisions across reinstalls)
Creates
_installs
registry table
Rebuilds FTS (full-text search) tables with new schema
Optional but recommended
- running this on the NAS is 5-10x faster than letting
screenpipe_sync.sh
do it inline
Idempotent (safe to re-run)
2.
screenpipe_sync.sh
- Daily sync script:
Copies SQLite data from Mac (
~/.screenpipe/db.sqlite
) to NAS (
/Volumes/screenpipe/archive.db
)
Syncs vision tables (frames, elements, OCR, UI events) and audio tables (chunks, transcriptions, speakers)
Copies frame data files and audio MP4s via rsync
Uses
install_id
(UUID) to handle multiple screenpipe installations safely
Schema-drift tolerant (auto-adds new columns)
Run per date:
./screenpipe_sync.sh
(yesterday) or
./screenpipe_sync.sh 2026-05-12
Current NAS State
Current NAS State...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
30042
|
1199
|
37
|
2026-05-13T07:23:45.386713+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-13/1778657025386_m2.jpg...
|
Windsurf
|
screenpipe [SSH: nas] — screenpipe_sync_helpers.sh
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Explorer (⌥⌘E)
Search (⇧⌘F)
Source Control (⇧⌘G) - 15 pending changes
15
Codemaps
DeepWiki
Run and Debug
Remote Explorer
Extensions (⇧⌘X)
Explorer
Explorer
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
screenpipe [SSH: nas]
#recycle
app
data
lib
screenpipe_sync_helpers.sh
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
archive.db.bak-pre-installid
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_fts_migrate.sh
U
screenpipe_sync copy.sh
U
screenpipe_sync_old.sh
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
M
screenpipe.db
Outline Section
Outline
Outline
Timeline Section
Timeline
Timeline
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 41257, 36613, 33153
3
Notifications
Windsurf - Settings
Teams, Daily: 0% quota used · Weekly: 68% quota used
Shell Script
LF
UTF-8
Spaces: 2
Ln 42, Col 14
Screen Reader Optimized
expanded
Command Succeeded
Syncing Screenpipe Data
14h
40m
11m
get familiar with the project. The idea is to copy the data from mac to nas (here). I am trying to update the database using
@
screenpipe_fts_migrate.sh
and then run
@
screenpipe_sync.sh
for a day to move the data. Review and see if you understand everything.
Expand content to scroll
Thought
for 1s
I'll read the two scripts and explore the project structure to understand the data migration setup.
Thoughts
Read
screenpipe_fts_migrate.sh and screenpipe_sync.sh
Analyzed
/volume1/screenpipe
Thought
for 2s
I understand the setup. Here's my analysis:
Project Overview
Project Overview
You're migrating Screenpipe data from Mac to NAS. The workflow:
1.
screenpipe_fts_migrate.sh
- One-time database schema migration:
Adds
install_id
column to all synced tables (prevents ID collisions across reinstalls)
Creates
_installs
registry table
Rebuilds FTS (full-text search) tables with new schema
Optional but recommended
- running this on the NAS is 5-10x faster than letting
screenpipe_sync.sh
do it inline
Idempotent (safe to re-run)
2.
screenpipe_sync.sh
- Daily sync script:
Copies SQLite data from Mac (
~/.screenpipe/db.sqlite
) to NAS (
/Volumes/screenpipe/archive.db
)
Syncs vision tables (frames, elements, OCR, UI events) and audio tables (chunks, transcriptions, speakers)
Copies frame data files and audio MP4s via rsync
Uses
install_id
(UUID) to handle multiple screenpipe installations safely
Schema-drift tolerant (auto-adds new columns)
Run per date:
./screenpipe_sync.sh
(yesterday) or
./screenpipe_sync.sh 2026-05-12
Current NAS State
Current NAS State...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Explorer (⌥⌘E)","depth":18,"bounds":{"left":0.0,"top":0.047885075,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true,"is_expanded":true},{"role":"AXStaticText","text":"","depth":21,"bounds":{"left":0.0033244682,"top":0.05586592,"width":0.004986702,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Search (⇧⌘F)","depth":18,"bounds":{"left":0.0,"top":0.07581804,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"bounds":{"left":0.0033244682,"top":0.083798885,"width":0.004986702,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Source Control (⇧⌘G) - 15 pending 
changes","depth":18,"bounds":{"left":0.0,"top":0.103751,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"bounds":{"left":0.0033244682,"top":0.11173184,"width":0.004986702,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"15","depth":21,"bounds":{"left":0.005319149,"top":0.11811652,"width":0.0033244682,"height":0.007980846},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.0056515955,"top":0.118914604,"width":0.0013297872,"height":0.0071827616}},{"char_start":1,"char_count":1,"bounds":{"left":0.006981383,"top":0.118914604,"width":0.0016622341,"height":0.0071827616}}],"role_description":"text"},{"role":"AXRadioButton","text":"Codemaps","depth":18,"bounds":{"left":0.0,"top":0.13168396,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"bounds":{"left":0.0033244682,"top":0.1396648,"width":0.004986702,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"DeepWiki","depth":18,"bounds":{"left":0.0,"top":0.15961692,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXRadioButton","text":"Run and 
Debug","depth":18,"bounds":{"left":0.0,"top":0.18754987,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"bounds":{"left":0.0033244682,"top":0.19553073,"width":0.004986702,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Remote Explorer","depth":18,"bounds":{"left":0.0,"top":0.21548285,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"bounds":{"left":0.0033244682,"top":0.22346368,"width":0.004986702,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Extensions (⇧⌘X)","depth":18,"bounds":{"left":0.0,"top":0.2434158,"width":0.011635638,"height":0.02793296},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":21,"bounds":{"left":0.0033244682,"top":0.25139666,"width":0.004986702,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Explorer","depth":17,"bounds":{"left":0.01462766,"top":0.047885075,"width":0.013630319,"height":0.023144454},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Explorer","depth":18,"bounds":{"left":0.01462766,"top":0.054269753,"width":0.013630319,"height":0.009577015},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.014960106,"top":0.055067837,"width":0.0019946808,"height":0.008778931}},{"char_start":1,"char_count":7,"bounds":{"left":0.016954787,"top":0.055067837,"width":0.011303191,"height":0.008778931}}],"role_description":"text"},{"role":"AXButton","text":"Explorer 
Section: screenpipe [SSH: nas]","depth":21,"bounds":{"left":0.011635638,"top":0.07102953,"width":0.0831117,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":true},{"role":"AXStaticText","text":"","depth":23,"bounds":{"left":0.011968086,"top":0.0726257,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Explorer Section: screenpipe [SSH: nas]","depth":22,"bounds":{"left":0.016954787,"top":0.07102953,"width":0.035904255,"height":0.014365523},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"screenpipe [SSH: nas]","depth":23,"bounds":{"left":0.016954787,"top":0.07342378,"width":0.035904255,"height":0.009577015},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.017287234,"top":0.07342378,"width":0.0019946808,"height":0.009577015}},{"char_start":1,"char_count":20,"bounds":{"left":0.018949468,"top":0.07342378,"width":0.034242023,"height":0.009577015}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.014295213,"top":0.08699122,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.08699122,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"#recycle","depth":27,"bounds":{"left":0.025930852,"top":0.08699122,"width":0.01462766,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.087789305,"width":0.0026595744,"height":0.0103751}},{"char_start":1,"char_count":7,"bounds":{"left":0.02825798,"top":0.087789305,"width":0.012300532,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.014295213,"top":0.10215483,"width":0.004654255,"height":0.01117
3184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.10215483,"width":0.0039893617,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"app","depth":27,"bounds":{"left":0.025930852,"top":0.10215483,"width":0.0063164895,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.10215483,"width":0.0023271276,"height":0.011173184}},{"char_start":1,"char_count":2,"bounds":{"left":0.027925532,"top":0.10215483,"width":0.004654255,"height":0.011173184}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.08676862,"top":0.10215483,"width":0.0039893617,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.014295213,"top":0.11652035,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.11652035,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"data","depth":27,"bounds":{"left":0.025930852,"top":0.11652035,"width":0.0076462766,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.11731844,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":3,"bounds":{"left":0.02825798,"top":0.11731844,"width":0.005319149,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.01462766,"top":0.13088587,"width":0.0043218085,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.13088587,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"lib","
depth":27,"bounds":{"left":0.025930852,"top":0.13088587,"width":0.0039893617,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.13168396,"width":0.0009973404,"height":0.0103751}},{"char_start":1,"char_count":2,"bounds":{"left":0.026928192,"top":0.13168396,"width":0.0033244682,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.021941489,"top":0.14604948,"width":0.003656915,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync_helpers.sh","depth":27,"bounds":{"left":0.027925532,"top":0.14604948,"width":0.048204787,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.02825798,"top":0.14604948,"width":0.0019946808,"height":0.011173184}},{"char_start":1,"char_count":25,"bounds":{"left":0.029920213,"top":0.14604948,"width":0.046210106,"height":0.011173184}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.014295213,"top":0.16041501,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.16041501,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"logs","depth":27,"bounds":{"left":0.025930852,"top":0.16041501,"width":0.006981383,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.16121309,"width":0.0009973404,"height":0.0103751}},{"char_start":1,"char_count":3,"bounds":{"left":0.026928192,"top":0.16121309,"width":0.0063164895,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.08676862,"top":0.16121309,"width":0.0039893617,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","tex
t":"","depth":26,"bounds":{"left":0.014295213,"top":0.17478053,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.17478053,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"pipes","depth":27,"bounds":{"left":0.025930852,"top":0.17478053,"width":0.00930851,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.17557861,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":4,"bounds":{"left":0.02825798,"top":0.17557861,"width":0.006981383,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.18994413,"width":0.0039893617,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":".gitignore","depth":27,"bounds":{"left":0.025930852,"top":0.18994413,"width":0.015957447,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.18994413,"width":0.0013297872,"height":0.011173184}},{"char_start":1,"char_count":9,"bounds":{"left":0.026928192,"top":0.18994413,"width":0.014960106,"height":0.011173184}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.20430966,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"app_settings.json","depth":27,"bounds":{"left":0.025930852,"top":0.20430966,"width":0.029920213,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.20510775,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":16,"bounds":{"left":0.027925532,"top":0.20510775,"width":0.027925532,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticTex
t","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.21867518,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"archive.db","depth":27,"bounds":{"left":0.025930852,"top":0.21867518,"width":0.01761968,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.21947326,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":9,"bounds":{"left":0.027925532,"top":0.21947326,"width":0.015625,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.23383878,"width":0.0039893617,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"archive.db-bak","depth":27,"bounds":{"left":0.025930852,"top":0.23383878,"width":0.025265958,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.23383878,"width":0.0023271276,"height":0.011173184}},{"char_start":1,"char_count":13,"bounds":{"left":0.027925532,"top":0.23383878,"width":0.023603724,"height":0.011173184}}],"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"bounds":{"left":0.087765954,"top":0.23383878,"width":0.0026595744,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.2482043,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"archive.db.bak-pre-installid","depth":27,"bounds":{"left":0.025930852,"top":0.2482043,"width":0.046210106,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.2490024,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":27,"bounds":{"left":0.027925532,"top":0.2490024,"width":0.04454787,"height":0.0103751}}],"role_description":"text
"},{"role":"AXStaticText","text":"U","depth":27,"bounds":{"left":0.087765954,"top":0.2490024,"width":0.0026595744,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.26256984,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"db.sqlite","depth":27,"bounds":{"left":0.025930852,"top":0.26256984,"width":0.01462766,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.26336792,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":8,"bounds":{"left":0.02825798,"top":0.26336792,"width":0.012300532,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.27773345,"width":0.0039893617,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"db.sqlite-shm","depth":27,"bounds":{"left":0.025930852,"top":0.27773345,"width":0.023271276,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.27773345,"width":0.0023271276,"height":0.011173184}},{"char_start":1,"char_count":12,"bounds":{"left":0.02825798,"top":0.27773345,"width":0.021276595,"height":0.011173184}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.29209897,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"db.sqlite-wal","depth":27,"bounds":{"left":0.025930852,"top":0.29209897,"width":0.021941489,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.29289705,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":12,"bounds":{"left":0.02825798,"top":0.29289705,"width":0.019614361,"height":0.0103751}}],"role_descript
ion":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.3064645,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_fts_migrate.sh","depth":27,"bounds":{"left":0.025930852,"top":0.3064645,"width":0.04488032,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.30726257,"width":0.0019946808,"height":0.0103751}},{"char_start":1,"char_count":24,"bounds":{"left":0.027925532,"top":0.30726257,"width":0.04288564,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"bounds":{"left":0.087765954,"top":0.30726257,"width":0.0026595744,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.3216281,"width":0.0039893617,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync 
copy.sh","depth":27,"bounds":{"left":0.025930852,"top":0.3216281,"width":0.042220745,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.3216281,"width":0.0019946808,"height":0.011173184}},{"char_start":1,"char_count":22,"bounds":{"left":0.027925532,"top":0.3216281,"width":0.04055851,"height":0.011173184}}],"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"bounds":{"left":0.087765954,"top":0.3216281,"width":0.0026595744,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.33599362,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync_old.sh","depth":27,"bounds":{"left":0.025930852,"top":0.33599362,"width":0.04055851,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.3367917,"width":0.0019946808,"height":0.0103751}},{"char_start":1,"char_count":21,"bounds":{"left":0.027925532,"top":0.3367917,"width":0.03856383,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.35035914,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync_updated.sh","depth":27,"bounds":{"left":0.025930852,"top":0.35035914,"width":0.04920213,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.35115722,"width":0.0019946808,"height":0.0103751}},{"char_start":1,"char_count":25,"bounds":{"left":0.027925532,"top":0.35115722,"width":0.047539894,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"U","depth":27,"bounds":{"left":0.087765954,"top":0.35115722,"width":0.0026595744,"height":0.009577015},"on_screen":true,"role_description":"text"},{"
role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.36552274,"width":0.0039893617,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync.sh","depth":27,"bounds":{"left":0.025930852,"top":0.36552274,"width":0.03357713,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.36552274,"width":0.0019946808,"height":0.011173184}},{"char_start":1,"char_count":17,"bounds":{"left":0.027925532,"top":0.36552274,"width":0.03158245,"height":0.011173184}}],"role_description":"text"},{"role":"AXStaticText","text":"M","depth":27,"bounds":{"left":0.087101065,"top":0.36552274,"width":0.0033244682,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.019614361,"top":0.37988827,"width":0.0039893617,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe.db","depth":27,"bounds":{"left":0.025930852,"top":0.37988827,"width":0.023936171,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.025930852,"top":0.38068634,"width":0.0019946808,"height":0.0103751}},{"char_start":1,"char_count":12,"bounds":{"left":0.027925532,"top":0.38068634,"width":0.022273935,"height":0.0103751}}],"role_description":"text"},{"role":"AXButton","text":"Outline 
Section","depth":21,"bounds":{"left":0.011635638,"top":0.95530725,"width":0.0831117,"height":0.015163607},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":23,"bounds":{"left":0.011968086,"top":0.95690346,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Outline","depth":22,"bounds":{"left":0.016954787,"top":0.95530725,"width":0.011635638,"height":0.015163607},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Outline","depth":23,"bounds":{"left":0.016954787,"top":0.9577015,"width":0.011635638,"height":0.009577015},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.017287234,"top":0.9584996,"width":0.0026595744,"height":0.009577015}},{"char_start":1,"char_count":6,"bounds":{"left":0.019946808,"top":0.9584996,"width":0.008976064,"height":0.009577015}}],"role_description":"text"},{"role":"AXButton","text":"Timeline 
Section","depth":21,"bounds":{"left":0.011635638,"top":0.9696728,"width":0.0831117,"height":0.015163607},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":23,"bounds":{"left":0.011968086,"top":0.97206706,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Timeline","depth":22,"bounds":{"left":0.016954787,"top":0.97047085,"width":0.013630319,"height":0.014365523},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Timeline","depth":23,"bounds":{"left":0.016954787,"top":0.9728651,"width":0.013630319,"height":0.009577015},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.017287234,"top":0.9728651,"width":0.0023271276,"height":0.009577015}},{"char_start":1,"char_count":7,"bounds":{"left":0.019281914,"top":0.9728651,"width":0.011635638,"height":0.009577015}}],"role_description":"text"},{"role":"AXButton","text":"remote SSH: nas","depth":16,"bounds":{"left":0.0016622341,"top":0.9848364,"width":0.024268618,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"bounds":{"left":0.0039893617,"top":0.98643255,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SSH: nas","depth":17,"bounds":{"left":0.00831117,"top":0.98723066,"width":0.015292553,"height":0.009577015},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.008643617,"top":0.98723066,"width":0.0009973404,"height":0.009577015}},{"char_start":1,"char_count":7,"bounds":{"left":0.009640957,"top":0.98723066,"width":0.012300532,"height":0.009577015}}],"role_description":"text"},{"role":"AXButton","text":"screenpipe (Git) - master*, Checkout 
Branch/Tag...","depth":16,"bounds":{"left":0.027260639,"top":0.9848364,"width":0.019281914,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"bounds":{"left":0.027925532,"top":0.98643255,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"master*","depth":17,"bounds":{"left":0.032247342,"top":0.98723066,"width":0.013630319,"height":0.009577015},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.032579787,"top":0.98723066,"width":0.0009973404,"height":0.009577015}},{"char_start":1,"char_count":6,"bounds":{"left":0.03357713,"top":0.98723066,"width":0.010970744,"height":0.009577015}}],"role_description":"text"},{"role":"AXButton","text":"screenpipe (Git) - Synchronize Changes","depth":16,"bounds":{"left":0.04654255,"top":0.9848364,"width":0.0063164895,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"No 
Problems","depth":16,"bounds":{"left":0.054853722,"top":0.9848364,"width":0.01861702,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"bounds":{"left":0.05618351,"top":0.98643255,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"0","depth":17,"bounds":{"left":0.06050532,"top":0.98723066,"width":0.0043218085,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":17,"bounds":{"left":0.064494684,"top":0.98643255,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"0","depth":17,"bounds":{"left":0.069148935,"top":0.98723066,"width":0.0029920214,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Forwarded Ports: 41257, 36613, 33153","depth":16,"bounds":{"left":0.07513298,"top":0.9848364,"width":0.010305851,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":17,"bounds":{"left":0.07646277,"top":0.98643255,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"3","depth":17,"bounds":{"left":0.080784574,"top":0.98723066,"width":0.0033244682,"height":0.009577015},"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Notifications","depth":16,"bounds":{"left":0.99102396,"top":0.9848364,"width":0.008976042,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Windsurf - 
Settings","depth":16,"bounds":{"left":0.9567819,"top":0.9848364,"width":0.03357713,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Teams, Daily: 0% quota used · Weekly: 68% quota used","depth":16,"bounds":{"left":0.9421542,"top":0.9848364,"width":0.012965426,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Shell Script","depth":16,"bounds":{"left":0.91988033,"top":0.9848364,"width":0.020611702,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"LF","depth":16,"bounds":{"left":0.9115692,"top":0.9848364,"width":0.0066489363,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"UTF-8","depth":16,"bounds":{"left":0.8969415,"top":0.9848364,"width":0.013297873,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Spaces: 2","depth":16,"bounds":{"left":0.87699467,"top":0.9848364,"width":0.01861702,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Ln 42, Col 14","depth":16,"bounds":{"left":0.85139626,"top":0.9848364,"width":0.024268618,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXButton","text":"Screen Reader 
Optimized","depth":16,"bounds":{"left":0.8061835,"top":0.9848364,"width":0.04454787,"height":0.014365523},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"expanded","depth":12,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"Command Succeeded","depth":12,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"Syncing Screenpipe Data","depth":20,"bounds":{"left":0.7659575,"top":0.05347167,"width":0.04255319,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.7659575,"top":0.054269753,"width":0.0026595744,"height":0.0103751}},{"char_start":1,"char_count":22,"bounds":{"left":0.76828456,"top":0.054269753,"width":0.040226065,"height":0.0103751}}],"role_description":"text"},{"role":"AXStaticText","text":"14h","depth":19,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"40m","depth":19,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"11m","depth":19,"on_screen":false,"role_description":"text"},{"role":"AXStaticText","text":"get familiar with the project. The idea is to copy the data from mac to nas (here). 
I am tryign to update the database using","depth":25,"bounds":{"left":0.7659575,"top":0.07102953,"width":0.2044548,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"@","depth":27,"bounds":{"left":0.76662236,"top":0.07102953,"width":0.0033244682,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_fts_migrate.sh","depth":27,"bounds":{"left":0.7699468,"top":0.07102953,"width":0.04288564,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"and then run","depth":25,"bounds":{"left":0.81349736,"top":0.07102953,"width":0.023603724,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"@","depth":27,"bounds":{"left":0.83776593,"top":0.07102953,"width":0.0029920214,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync.sh","depth":27,"bounds":{"left":0.84075797,"top":0.07102953,"width":0.032247342,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"for a day to move the data. 
Review and see is you understand everything.","depth":25,"bounds":{"left":0.7659575,"top":0.07102953,"width":0.21243352,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Expand content to scroll","depth":23,"bounds":{"left":0.87333775,"top":0.07102953,"width":0.012300532,"height":0.0007980846},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"Thought","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.014295213,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"for 1s","depth":21,"bounds":{"left":0.77859044,"top":0.07102953,"width":0.00930851,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"I'll read the two scripts and explore the project structure to understand the data migration setup.","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.17220744,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Thoughts","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.015957447,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Read","depth":20,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.008643617,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_fts_migrate.sh and 
screenpipe_sync.sh","depth":20,"bounds":{"left":0.77293885,"top":0.07102953,"width":0.08643617,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Analyzed","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.015625,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":23,"bounds":{"left":0.77958775,"top":0.07102953,"width":0.0023271276,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"/volume1/screenpipe","depth":22,"bounds":{"left":0.78523934,"top":0.07102953,"width":0.038231384,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Thought","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.014295213,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"for 2s","depth":21,"bounds":{"left":0.77859044,"top":0.07102953,"width":0.009640957,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"I understand the setup. Here's my analysis:","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.07712766,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Project Overview","depth":20,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.23238032,"height":0.0007980846},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Project Overview","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.033909574,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"You're migrating Screenpipe data from Mac to NAS. 
The workflow:","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.118351065,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"1.","depth":22,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.0039893617,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"bounds":{"left":0.76728725,"top":0.07102953,"width":0.0023271276,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_fts_migrate.sh","depth":24,"bounds":{"left":0.77293885,"top":0.07102953,"width":0.053856384,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"- One-time database schema migration:","depth":21,"bounds":{"left":0.8267952,"top":0.07102953,"width":0.07247341,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Adds","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.010305851,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"install_id","depth":23,"bounds":{"left":0.78224736,"top":0.07102953,"width":0.01861702,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"column to all synced tables (prevents ID collisions across reinstalls)","depth":22,"bounds":{"left":0.8018617,"top":0.07102953,"width":0.12134308,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Creates","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.015292553,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"_installs","depth":23,"bounds":{"left":0.7869016,"top":0.07102953,"width":0.016954787,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"registry 
table","depth":22,"bounds":{"left":0.80452126,"top":0.07102953,"width":0.024933511,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Rebuilds FTS (full-text search) tables with new schema","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.09840426,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Optional but recommended","depth":23,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.05119681,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"- running this on the NAS is 5-10x faster than letting","depth":22,"bounds":{"left":0.82214093,"top":0.07102953,"width":0.09541223,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"bounds":{"left":0.9172208,"top":0.07102953,"width":0.0023271276,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync.sh","depth":24,"bounds":{"left":0.92287236,"top":0.07102953,"width":0.038896278,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"do it inline","depth":22,"bounds":{"left":0.9617686,"top":0.07102953,"width":0.019614361,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Idempotent (safe to 
re-run)","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.04920213,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"2.","depth":22,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.004654255,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"bounds":{"left":0.76795214,"top":0.07102953,"width":0.0023271276,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"screenpipe_sync.sh","depth":24,"bounds":{"left":0.77327126,"top":0.07102953,"width":0.039228722,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"- Daily sync script:","depth":21,"bounds":{"left":0.8121675,"top":0.07102953,"width":0.034574468,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Copies SQLite data from Mac (","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.05518617,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"~/.screenpipe/db.sqlite","depth":23,"bounds":{"left":0.8267952,"top":0.07102953,"width":0.04255319,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":") to NAS (","depth":22,"bounds":{"left":0.8703458,"top":0.07102953,"width":0.01761968,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"/Volumes/screenpipe/archive.db","depth":23,"bounds":{"left":0.88896275,"top":0.07102953,"width":0.05518617,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":")","depth":22,"bounds":{"left":0.94514626,"top":0.07102953,"width":0.0013297872,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Syncs vision tables (frames, elements, OCR, UI events) and audio tables (chunks, transcriptions, 
speakers)","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.19082446,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Copies frame data files and audio MP4s via rsync","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.08809841,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Uses","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.010305851,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"install_id","depth":23,"bounds":{"left":0.78224736,"top":0.07102953,"width":0.018284574,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"(UUID) to handle multiple screenpipe installations safely","depth":22,"bounds":{"left":0.8015292,"top":0.07102953,"width":0.10139628,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Schema-drift tolerant (auto-adds new columns)","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.08577128,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Run per date:","depth":22,"bounds":{"left":0.7709442,"top":0.07102953,"width":0.025265958,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":24,"bounds":{"left":0.7962101,"top":0.07102953,"width":0.0023271276,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"./screenpipe_sync.sh","depth":24,"bounds":{"left":0.8015292,"top":0.07102953,"width":0.043550532,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"(yesterday) or","depth":22,"bounds":{"left":0.84474736,"top":0.07102953,"width":0.027260639,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"./screenpipe_sync.sh 
2026-05-12","depth":23,"bounds":{"left":0.87300533,"top":0.07102953,"width":0.05718085,"height":0.0007980846},"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Current NAS State","depth":20,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.23238032,"height":0.0007980846},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"Current NAS State","depth":21,"bounds":{"left":0.76329786,"top":0.07102953,"width":0.03656915,"height":0.0007980846},"on_screen":true,"role_description":"text"}]...
|
-7059454017222197515
|
9186093027706511905
|
click
|
accessibility
|
NULL
|
Explorer (⌥⌘E)
Search (⇧⌘F)
Source Control (⇧⌘ Explorer (⌥⌘E)
Search (⇧⌘F)
Source Control (⇧⌘G) - 15 pending changes
15
Codemaps
DeepWiki
Run and Debug
Remote Explorer
Extensions (⇧⌘X)
Explorer
Explorer
Explorer Section: screenpipe [SSH: nas]
Explorer Section: screenpipe [SSH: nas]
screenpipe [SSH: nas]
#recycle
app
data
lib
screenpipe_sync_helpers.sh
logs
pipes
.gitignore
app_settings.json
archive.db
archive.db-bak
U
archive.db.bak-pre-installid
U
db.sqlite
db.sqlite-shm
db.sqlite-wal
screenpipe_fts_migrate.sh
U
screenpipe_sync copy.sh
U
screenpipe_sync_old.sh
screenpipe_sync_updated.sh
U
screenpipe_sync.sh
M
screenpipe.db
Outline Section
Outline
Outline
Timeline Section
Timeline
Timeline
remote SSH: nas
SSH: nas
screenpipe (Git) - master*, Checkout Branch/Tag...
master*
screenpipe (Git) - Synchronize Changes
No Problems
0
0
Forwarded Ports: 41257, 36613, 33153
3
Notifications
Windsurf - Settings
Teams, Daily: 0% quota used · Weekly: 68% quota used
Shell Script
LF
UTF-8
Spaces: 2
Ln 42, Col 14
Screen Reader Optimized
expanded
Command Succeeded
Syncing Screenpipe Data
14h
40m
11m
get familiar with the project. The idea is to copy the data from mac to nas (here). I am tryign to update the database using
@
screenpipe_fts_migrate.sh
and then run
@
screenpipe_sync.sh
for a day to move the data. Review and see is you understand everything.
Expand content to scroll
Thought
for 1s
I'll read the two scripts and explore the project structure to understand the data migration setup.
Thoughts
Read
screenpipe_fts_migrate.sh and screenpipe_sync.sh
Analyzed
/volume1/screenpipe
Thought
for 2s
I understand the setup. Here's my analysis:
Project Overview
Project Overview
You're migrating Screenpipe data from Mac to NAS. The workflow:
1.
screenpipe_fts_migrate.sh
- One-time database schema migration:
Adds
install_id
column to all synced tables (prevents ID collisions across reinstalls)
Creates
_installs
registry table
Rebuilds FTS (full-text search) tables with new schema
Optional but recommended
- running this on the NAS is 5-10x faster than letting
screenpipe_sync.sh
do it inline
Idempotent (safe to re-run)
2.
screenpipe_sync.sh
- Daily sync script:
Copies SQLite data from Mac (
~/.screenpipe/db.sqlite
) to NAS (
/Volumes/screenpipe/archive.db
)
Syncs vision tables (frames, elements, OCR, UI events) and audio tables (chunks, transcriptions, speakers)
Copies frame data files and audio MP4s via rsync
Uses
install_id
(UUID) to handle multiple screenpipe installations safely
Schema-drift tolerant (auto-adds new columns)
Run per date:
./screenpipe_sync.sh
(yesterday) or
./screenpipe_sync.sh 2026-05-12
Current NAS State
Current NAS State...
|
30040
|
NULL
|
NULL
|
NULL
|
|
19040
|
816
|
6
|
2026-05-11T12:13:39.582402+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-11/1778 /Users/lukas/.screenpipe/data/data/2026-05-11/1778501619582_m1.jpg...
|
Code
|
Review rate limit handli… — app
|
1
|
NULL
|
monitor_1
|
NULL
|
NULL
|
NULL
|
NULL
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 22 pending changes
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Testing
Claude Code
EXPLORER
EXPLORER
Explorer Section: app
Explorer Section: app
APP
CheckAndRetryRemoteMatch.php
CreateFollowupActivity.php
CreateNotes.php
MatchActivitiesToNewOpportunity.php
MatchActivityCrmData.php
M
NoteObject.php
SaveActivity.php
SaveTranscription.php
SetupLayout.php
SyncActivity.php
SyncFieldMetadata.php
SyncHubspotObjects.php
SyncLeads.php
SyncObjects.php
SyncOpportunitiesJob.php
SyncOpportunity.php
SyncProfileMetadata.php
SyncTeamFieldsJob.php
SyncTeamMetadata.php
UpdateOpportunitySpecifications.php
UpdateStage.php
DealRisks
Mailbox
MeetingBot
Middleware
...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Explorer (⇧⌘E)","depth":19,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true,"is_expanded":true},{"role":"AXStaticText","text":"","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Search (⇧⌘F)","depth":19,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Source Control (⌃⇧G) - 22 pending changes","depth":19,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXRadioButton","text":"Run and Debug (⇧⌘D)","depth":19,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Remote Explorer","depth":19,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Extensions (⇧⌘X) - 2 require 
update","depth":19,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"2","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Testing","depth":19,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":22,"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Claude Code","depth":19,"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"EXPLORER","depth":17,"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"EXPLORER","depth":18,"on_screen":true,"role_description":"text"},{"role":"AXButton","text":"Explorer Section: app","depth":21,"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":true},{"role":"AXStaticText","text":"","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Explorer Section: 
app","depth":22,"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"APP","depth":23,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"CheckAndRetryRemoteMatch.php","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"CreateFollowupActivity.php","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"CreateNotes.php","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"MatchActivitiesToNewOpportunity.php","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"MatchActivityCrmData.php","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"M","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"NoteObject.php","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SaveActivity.php","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SaveTranscription.php","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SetupLayout.php","depth":27,"on_screen":true,"role_
description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncActivity.php","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncFieldMetadata.php","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncHubspotObjects.php","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncLeads.php","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncObjects.php","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncOpportunitiesJob.php","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncOpportunity.php","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncProfileMetadata.php","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncTeamFieldsJob.php","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncTeamMetadata.php","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","t
ext":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"UpdateOpportunitySpecifications.php","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"UpdateStage.php","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"DealRisks","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Mailbox","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"MeetingBot","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Middleware","depth":27,"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"on_screen":true,"role_description":"text"}]...
|
394509188446610786
|
9185404705839186695
|
click
|
accessibility
|
NULL
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 22 pending changes
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Testing
Claude Code
EXPLORER
EXPLORER
Explorer Section: app
Explorer Section: app
APP
CheckAndRetryRemoteMatch.php
CreateFollowupActivity.php
CreateNotes.php
MatchActivitiesToNewOpportunity.php
MatchActivityCrmData.php
M
NoteObject.php
SaveActivity.php
SaveTranscription.php
SetupLayout.php
SyncActivity.php
SyncFieldMetadata.php
SyncHubspotObjects.php
SyncLeads.php
SyncObjects.php
SyncOpportunitiesJob.php
SyncOpportunity.php
SyncProfileMetadata.php
SyncTeamFieldsJob.php
SyncTeamMetadata.php
UpdateOpportunitySpecifications.php
UpdateStage.php
DealRisks
Mailbox
MeetingBot
Middleware
...
|
NULL
|
NULL
|
NULL
|
NULL
|
|
18939
|
813
|
13
|
2026-05-11T12:01:50.748649+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-11/1778 /Users/lukas/.screenpipe/data/data/2026-05-11/1778500910748_m2.jpg...
|
Code
|
MatchActivityCrmData.php — app — Modified
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 22 pending changes
22
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Testing
Claude Code
EXPLORER
EXPLORER
Explorer Section: app
Explorer Section: app
APP
AjReports
Audio
AutomatedReports
Calendar
Crm
Delete
Hubspot
Traits
FetchMergedObjectsPageJob.php
HubspotAppUninstallJob.php
ImportAccountBatch.php
ImportBatchJobTrait.php
ImportContactBatch.php
ImportOpportunityBatch.php
ProcessHubspotWebhookEventsTrait.php
ProcessInternalWebhookEventsJob.php
ProcessMergedObjectJob.php
ProcessWebhookEventsJob.php
UpdateDealWebhookSubscriptionJob.php
Salesforce
AutologDelayedToCrm.php
CheckAndRetryRemoteMatch.php
CreateFollowupActivity.php
CreateNotes.php
MatchActivitiesToNewOpportunity.php
MatchActivityCrmData.php
M
NoteObject.php
SaveActivity.php
SaveTranscription.php
SetupLayout.php
SyncActivity.php
SyncFieldMetadata.php
SyncHubspotObjects.php
SyncLeads.php
SyncObjects.php
SyncOpportunitiesJob.php
SyncOpportunity.php
SyncProfileMetadata.php
SyncTeamFieldsJob.php
SyncTeamMetadata.php
UpdateOpportunitySpecifications.php...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Explorer (⇧⌘E)","depth":19,"bounds":{"left":0.0,"top":0.047885075,"width":0.015957447,"height":0.03830806},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true,"is_expanded":true},{"role":"AXStaticText","text":"","depth":22,"bounds":{"left":0.0039893617,"top":0.057462092,"width":0.007978723,"height":0.01915403},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Search (⇧⌘F)","depth":19,"bounds":{"left":0.0,"top":0.08619314,"width":0.015957447,"height":0.03830806},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":22,"bounds":{"left":0.0039893617,"top":0.09577015,"width":0.007978723,"height":0.01915403},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Source Control (⌃⇧G) - 22 pending changes","depth":19,"bounds":{"left":0.0,"top":0.1245012,"width":0.015957447,"height":0.03830806},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":22,"bounds":{"left":0.0039893617,"top":0.13407822,"width":0.007978723,"height":0.01915403},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"22","depth":22,"bounds":{"left":0.007978723,"top":0.1452514,"width":0.0039893617,"height":0.008778931},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.007978723,"top":0.14604948,"width":0.0023271276,"height":0.007980846}},{"char_start":1,"char_count":1,"bounds":{"left":0.009973404,"top":0.14604948,"width":0.0019946808,"height":0.007980846}}],"role_description":"text"},{"role":"AXRadioButton","text":"Run and Debug 
(⇧⌘D)","depth":19,"bounds":{"left":0.0,"top":0.16280925,"width":0.015957447,"height":0.03830806},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":22,"bounds":{"left":0.0039893617,"top":0.17238627,"width":0.007978723,"height":0.01915403},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Remote Explorer","depth":19,"bounds":{"left":0.0,"top":0.20111732,"width":0.015957447,"height":0.03830806},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":22,"bounds":{"left":0.0039893617,"top":0.21069433,"width":0.007978723,"height":0.01915403},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Extensions (⇧⌘X) - 2 require update","depth":19,"bounds":{"left":0.0,"top":0.23942538,"width":0.015957447,"height":0.03830806},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":22,"bounds":{"left":0.0039893617,"top":0.2490024,"width":0.007978723,"height":0.01915403},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"2","depth":22,"bounds":{"left":0.009640957,"top":0.2601756,"width":0.0019946808,"height":0.008778931},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Testing","depth":19,"bounds":{"left":0.0,"top":0.27773345,"width":0.015957447,"height":0.03830806},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":22,"bounds":{"left":0.0039893617,"top":0.28731045,"width":0.007978723,"height":0.01915403},"on_screen":true,"role_description":"text"},{"role":"AXRadioBu
tton","text":"Claude Code","depth":19,"bounds":{"left":0.0,"top":0.3160415,"width":0.015957447,"height":0.03830806},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"EXPLORER","depth":17,"bounds":{"left":0.022606382,"top":0.047885075,"width":0.018949468,"height":0.02793296},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"EXPLORER","depth":18,"bounds":{"left":0.022606382,"top":0.056664005,"width":0.018949468,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.022606382,"top":0.056664005,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":7,"bounds":{"left":0.024933511,"top":0.056664005,"width":0.01662234,"height":0.0103751}}],"role_description":"text"},{"role":"AXButton","text":"Explorer Section: app","depth":21,"bounds":{"left":0.015957447,"top":0.07581804,"width":0.09940159,"height":0.017557861},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":true},{"role":"AXStaticText","text":"","depth":23,"bounds":{"left":0.01662234,"top":0.07821229,"width":0.005319149,"height":0.012769354},"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Explorer Section: 
app","depth":22,"bounds":{"left":0.022606382,"top":0.07581804,"width":0.0076462766,"height":0.017557861},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"APP","depth":23,"bounds":{"left":0.022606382,"top":0.079010375,"width":0.0076462766,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.024933511,"top":0.0933759,"width":0.005319149,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"AjReports","depth":27,"bounds":{"left":0.03125,"top":0.0933759,"width":0.019614361,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03125,"top":0.0933759,"width":0.0029920214,"height":0.011971269}},{"char_start":1,"char_count":8,"bounds":{"left":0.034242023,"top":0.0933759,"width":0.01662234,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.024933511,"top":0.110135674,"width":0.005319149,"height":0.012769354},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Audio","depth":27,"bounds":{"left":0.03125,"top":0.110135674,"width":0.011635638,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03125,"top":0.11093376,"width":0.0029920214,"height":0.011971269}},{"char_start":1,"char_count":4,"bounds":{"left":0.034242023,"top":0.11093376,"width":0.008643617,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.024933511,"top":0.12769353,"width":0.005319149,"height":0.012769354},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"AutomatedReports","depth":27,"bounds":{"left":0.03125,"top":0.12769353,"width":0.03756649,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03125,"top":0.12849163,"width":0.0029920214,"height":0.011971269}},{"char
_start":1,"char_count":15,"bounds":{"left":0.034242023,"top":0.12849163,"width":0.034906916,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.024933511,"top":0.1452514,"width":0.005319149,"height":0.012769354},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Calendar","depth":27,"bounds":{"left":0.03125,"top":0.1452514,"width":0.017952127,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03125,"top":0.14604948,"width":0.0029920214,"height":0.011971269}},{"char_start":1,"char_count":7,"bounds":{"left":0.034242023,"top":0.14604948,"width":0.015292553,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.024933511,"top":0.16280925,"width":0.005319149,"height":0.012769354},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Crm","depth":27,"bounds":{"left":0.03125,"top":0.16280925,"width":0.00831117,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.10605053,"top":0.16360734,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.027593086,"top":0.18036711,"width":0.005319149,"height":0.012769354},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Delete","depth":27,"bounds":{"left":0.033909574,"top":0.18036711,"width":0.012965426,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.1811652,"width":0.0033244682,"height":0.011971269}},{"char_start":1,"char_count":5,"bounds":{"left":0.03723404,"top":0.1811652,"width":0.009640957,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.027593086,"top":0.19792499,"width":0.005319149,"height":0.0127
69354},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Hubspot","depth":27,"bounds":{"left":0.033909574,"top":0.19792499,"width":0.017287234,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.19872306,"width":0.0033244682,"height":0.011971269}},{"char_start":1,"char_count":6,"bounds":{"left":0.03723404,"top":0.19872306,"width":0.013962766,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.03025266,"top":0.21548285,"width":0.005319149,"height":0.012769354},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Traits","depth":27,"bounds":{"left":0.03656915,"top":0.21548285,"width":0.011303191,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.21628092,"width":0.0023271276,"height":0.011971269}},{"char_start":1,"char_count":5,"bounds":{"left":0.038896278,"top":0.21628092,"width":0.008976064,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.23144454,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"FetchMergedObjectsPageJob.php","depth":27,"bounds":{"left":0.03656915,"top":0.2330407,"width":0.06881649,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.23383878,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":28,"bounds":{"left":0.039228722,"top":0.23383878,"width":0.06615692,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.2490024,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"HubspotAppUninstallJob.php","depth":27,"bounds":{"left":0.03656915,"top":0.2
5059855,"width":0.059175532,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.25139666,"width":0.0033244682,"height":0.011971269}},{"char_start":1,"char_count":25,"bounds":{"left":0.039893616,"top":0.25139666,"width":0.055851065,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.26656026,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"ImportAccountBatch.php","depth":27,"bounds":{"left":0.03656915,"top":0.26815644,"width":0.050531916,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.26895452,"width":0.0013297872,"height":0.011971269}},{"char_start":1,"char_count":21,"bounds":{"left":0.037898935,"top":0.26895452,"width":0.04920213,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.28411812,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"ImportBatchJobTrait.php","depth":27,"bounds":{"left":0.03656915,"top":0.2857143,"width":0.049867023,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.28651237,"width":0.0013297872,"height":0.011971269}},{"char_start":1,"char_count":22,"bounds":{"left":0.037898935,"top":0.28651237,"width":0.048537236,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.30167598,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"ImportContactBatch.php","depth":27,"bounds":{"left":0.03656915,"top":0.30327216,"width":0.049867023,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.036
56915,"top":0.30407023,"width":0.0013297872,"height":0.011971269}},{"char_start":1,"char_count":21,"bounds":{"left":0.037898935,"top":0.30407023,"width":0.048537236,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.31923383,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"ImportOpportunityBatch.php","depth":27,"bounds":{"left":0.03656915,"top":0.32083002,"width":0.057845745,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.3216281,"width":0.0013297872,"height":0.011971269}},{"char_start":1,"char_count":25,"bounds":{"left":0.037898935,"top":0.3216281,"width":0.056848403,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.3367917,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"ProcessHubspotWebhookEventsTrait.php","depth":27,"bounds":{"left":0.03656915,"top":0.33838788,"width":0.077792555,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.33918595,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":35,"bounds":{"left":0.039228722,"top":0.33918595,"width":0.080784574,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.35434955,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"ProcessInternalWebhookEventsJob.php","depth":27,"bounds":{"left":0.03656915,"top":0.35594574,"width":0.078457445,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.3567438,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":34,"b
ounds":{"left":0.039228722,"top":0.3567438,"width":0.0774601,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.3719074,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"ProcessMergedObjectJob.php","depth":27,"bounds":{"left":0.03656915,"top":0.3735036,"width":0.061170213,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.37430167,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":25,"bounds":{"left":0.039228722,"top":0.37430167,"width":0.058843084,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.38946527,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"ProcessWebhookEventsJob.php","depth":27,"bounds":{"left":0.03656915,"top":0.39106146,"width":0.06482713,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.39185953,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":26,"bounds":{"left":0.039228722,"top":0.39185953,"width":0.06216755,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.40702313,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"UpdateDealWebhookSubscriptionJob.php","depth":27,"bounds":{"left":0.03656915,"top":0.4086193,"width":0.076130316,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.4094174,"width":0.0033244682,"height":0.011971269}},{"char_start":1,"char_count":35,"bounds":{"left":0.039893616,"top":0.4094174,"width":0.08111702,"height":0.011971269}}],"role_description":"text"},{
"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.027593086,"top":0.42617717,"width":0.005319149,"height":0.012769354},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Salesforce","depth":27,"bounds":{"left":0.033909574,"top":0.42617717,"width":0.021276595,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.42697525,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":9,"bounds":{"left":0.03656915,"top":0.42697525,"width":0.01861702,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.44213888,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"AutologDelayedToCrm.php","depth":27,"bounds":{"left":0.033909574,"top":0.44373503,"width":0.053856384,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.4445331,"width":0.0029920214,"height":0.011971269}},{"char_start":1,"char_count":22,"bounds":{"left":0.036901597,"top":0.4445331,"width":0.05119681,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.45969674,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"CheckAndRetryRemoteMatch.php","depth":27,"bounds":{"left":0.033909574,"top":0.4612929,"width":0.068484046,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.46209097,"width":0.0029920214,"height":0.011971269}},{"char_start":1,"char_count":27,"bounds":{"left":0.036901597,"top":0.46209097,"width":0.06549202,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.4772546,"width":0.0063164895,"height":0.015163607},"o
n_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"CreateFollowupActivity.php","depth":27,"bounds":{"left":0.033909574,"top":0.47885075,"width":0.054853722,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.47964883,"width":0.0029920214,"height":0.011971269}},{"char_start":1,"char_count":25,"bounds":{"left":0.036901597,"top":0.47964883,"width":0.051861703,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.49481246,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"CreateNotes.php","depth":27,"bounds":{"left":0.033909574,"top":0.4964086,"width":0.034242023,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.49720672,"width":0.0029920214,"height":0.011971269}},{"char_start":1,"char_count":14,"bounds":{"left":0.036901597,"top":0.49720672,"width":0.03125,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.5123703,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"MatchActivitiesToNewOpportunity.php","depth":27,"bounds":{"left":0.033909574,"top":0.5139665,"width":0.07712766,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.51476455,"width":0.0039893617,"height":0.011971269}},{"char_start":1,"char_count":34,"bounds":{"left":0.037898935,"top":0.51476455,"width":0.07347074,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.52992815,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"MatchActivityCrmData.php","depth":27,"bounds"
:{"left":0.033909574,"top":0.53152436,"width":0.054521278,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.5323224,"width":0.0039893617,"height":0.011971269}},{"char_start":1,"char_count":23,"bounds":{"left":0.037898935,"top":0.5323224,"width":0.050531916,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"M","depth":27,"bounds":{"left":0.10638298,"top":0.5323224,"width":0.003656915,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.547486,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"NoteObject.php","depth":27,"bounds":{"left":0.033909574,"top":0.5490822,"width":0.031914894,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.54988027,"width":0.0033244682,"height":0.011971269}},{"char_start":1,"char_count":13,"bounds":{"left":0.03723404,"top":0.54988027,"width":0.028922873,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.56504387,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SaveActivity.php","depth":27,"bounds":{"left":0.033909574,"top":0.5666401,"width":0.03324468,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.5674381,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":15,"bounds":{"left":0.03656915,"top":0.5674381,"width":0.030585106,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.5826017,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SaveTranscrip
tion.php","depth":27,"bounds":{"left":0.033909574,"top":0.58419794,"width":0.04454787,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.584996,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":20,"bounds":{"left":0.03656915,"top":0.584996,"width":0.042220745,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.60015965,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SetupLayout.php","depth":27,"bounds":{"left":0.033909574,"top":0.6017558,"width":0.034242023,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.60255384,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":14,"bounds":{"left":0.03656915,"top":0.60255384,"width":0.031914894,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.6177175,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncActivity.php","depth":27,"bounds":{"left":0.033909574,"top":0.61931366,"width":0.03357713,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.6201117,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":15,"bounds":{"left":0.03656915,"top":0.6201117,"width":0.030917553,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.63527536,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncFieldMetadata.php","depth":27,"bounds":{"left":0.033909574,"top":0.6368715,"width":0.047539894,"height":0.011971269},"on_screen":true,"lines":[{"char_start
":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.63766956,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":20,"bounds":{"left":0.03656915,"top":0.63766956,"width":0.04488032,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.6528332,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncHubspotObjects.php","depth":27,"bounds":{"left":0.033909574,"top":0.6544294,"width":0.051861703,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.6552275,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":21,"bounds":{"left":0.03656915,"top":0.6552275,"width":0.04920213,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.6703911,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncLeads.php","depth":27,"bounds":{"left":0.033909574,"top":0.67198724,"width":0.030917553,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.67278534,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":12,"bounds":{"left":0.03656915,"top":0.67278534,"width":0.02825798,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.68794894,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncObjects.php","depth":27,"bounds":{"left":0.033909574,"top":0.6895451,"width":0.034574468,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.6903432,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":14,"bounds
":{"left":0.03656915,"top":0.6903432,"width":0.031914894,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.7055068,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncOpportunitiesJob.php","depth":27,"bounds":{"left":0.033909574,"top":0.70710295,"width":0.053856384,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.70790106,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":23,"bounds":{"left":0.03656915,"top":0.70790106,"width":0.05119681,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.72306466,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncOpportunity.php","depth":27,"bounds":{"left":0.033909574,"top":0.7246608,"width":0.04288564,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.7254589,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":18,"bounds":{"left":0.03656915,"top":0.7254589,"width":0.040226065,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.7406225,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncProfileMetadata.php","depth":27,"bounds":{"left":0.033909574,"top":0.7422187,"width":0.05086436,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.7430168,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":22,"bounds":{"left":0.03656915,"top":0.7430168,"width":0.048204787,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","t
ext":"","depth":27,"bounds":{"left":0.026595745,"top":0.7581804,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncTeamFieldsJob.php","depth":27,"bounds":{"left":0.033909574,"top":0.75977653,"width":0.04886968,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.77573824,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncTeamMetadata.php","depth":27,"bounds":{"left":0.033909574,"top":0.7773344,"width":0.048537236,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.7932961,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"UpdateOpportunitySpecifications.php","depth":27,"bounds":{"left":0.033909574,"top":0.79489225,"width":0.076130316,"height":0.011971269},"on_screen":true,"role_description":"text"}]...
|
-8343544251264563800
|
9185263399333994826
|
click
|
accessibility
|
NULL
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 22 pending changes
22
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Testing
Claude Code
EXPLORER
EXPLORER
Explorer Section: app
Explorer Section: app
APP
AjReports
Audio
AutomatedReports
Calendar
Crm
Delete
Hubspot
Traits
FetchMergedObjectsPageJob.php
HubspotAppUninstallJob.php
ImportAccountBatch.php
ImportBatchJobTrait.php
ImportContactBatch.php
ImportOpportunityBatch.php
ProcessHubspotWebhookEventsTrait.php
ProcessInternalWebhookEventsJob.php
ProcessMergedObjectJob.php
ProcessWebhookEventsJob.php
UpdateDealWebhookSubscriptionJob.php
Salesforce
AutologDelayedToCrm.php
CheckAndRetryRemoteMatch.php
CreateFollowupActivity.php
CreateNotes.php
MatchActivitiesToNewOpportunity.php
MatchActivityCrmData.php
M
NoteObject.php
SaveActivity.php
SaveTranscription.php
SetupLayout.php
SyncActivity.php
SyncFieldMetadata.php
SyncHubspotObjects.php
SyncLeads.php
SyncObjects.php
SyncOpportunitiesJob.php
SyncOpportunity.php
SyncProfileMetadata.php
SyncTeamFieldsJob.php
SyncTeamMetadata.php
UpdateOpportunitySpecifications.php...
|
NULL
|
/Users/lukas/jiminny/app/app/Jobs/Crm/MatchActivit /Users/lukas/jiminny/app/app/Jobs/Crm/MatchActivityCrmData.php...
|
NULL
|
NULL
|
|
18959
|
813
|
25
|
2026-05-11T12:04:46.281308+00:00
|
/Users/lukas/.screenpipe/data/data/2026-05-11/1778 /Users/lukas/.screenpipe/data/data/2026-05-11/1778501086281_m2.jpg...
|
Code
|
Review rate limit handli… — app
|
1
|
NULL
|
monitor_2
|
NULL
|
NULL
|
NULL
|
NULL
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧ Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 22 pending changes
22
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Testing
Claude Code
EXPLORER
EXPLORER
Explorer Section: app
Explorer Section: app
APP
AjReports
Audio
AutomatedReports
Calendar
Crm
Delete
Hubspot
Traits
FetchMergedObjectsPageJob.php
HubspotAppUninstallJob.php
ImportAccountBatch.php
ImportBatchJobTrait.php
ImportContactBatch.php
ImportOpportunityBatch.php
ProcessHubspotWebhookEventsTrait.php
ProcessInternalWebhookEventsJob.php
ProcessMergedObjectJob.php
ProcessWebhookEventsJob.php
UpdateDealWebhookSubscriptionJob.php
Salesforce
AutologDelayedToCrm.php
CheckAndRetryRemoteMatch.php
CreateFollowupActivity.php
CreateNotes.php
MatchActivitiesToNewOpportunity.php
MatchActivityCrmData.php
M
NoteObject.php
SaveActivity.php
SaveTranscription.php
SetupLayout.php
SyncActivity.php
SyncFieldMetadata.php
SyncHubspotObjects.php
SyncLeads.php
SyncObjects.php
SyncOpportunitiesJob.php
SyncOpportunity.php
SyncProfileMetadata.php
SyncTeamFieldsJob.php
SyncTeamMetadata.php
UpdateOpportunitySpecifications.php...
|
[{"role":"AXRadioButton","text [{"role":"AXRadioButton","text":"Explorer (⇧⌘E)","depth":19,"bounds":{"left":0.0,"top":0.047885075,"width":0.015957447,"height":0.03830806},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":true,"is_expanded":true},{"role":"AXStaticText","text":"","depth":22,"bounds":{"left":0.0039893617,"top":0.057462092,"width":0.007978723,"height":0.01915403},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Search (⇧⌘F)","depth":19,"bounds":{"left":0.0,"top":0.08619314,"width":0.015957447,"height":0.03830806},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":22,"bounds":{"left":0.0039893617,"top":0.09577015,"width":0.007978723,"height":0.01915403},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Source Control (⌃⇧G) - 22 pending changes","depth":19,"bounds":{"left":0.0,"top":0.1245012,"width":0.015957447,"height":0.03830806},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":22,"bounds":{"left":0.0039893617,"top":0.13407822,"width":0.007978723,"height":0.01915403},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"22","depth":22,"bounds":{"left":0.007978723,"top":0.1452514,"width":0.0039893617,"height":0.008778931},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.007978723,"top":0.14604948,"width":0.0023271276,"height":0.007980846}},{"char_start":1,"char_count":1,"bounds":{"left":0.009973404,"top":0.14604948,"width":0.0019946808,"height":0.007980846}}],"role_description":"text"},{"role":"AXRadioButton","text":"Run and Debug 
(⇧⌘D)","depth":19,"bounds":{"left":0.0,"top":0.16280925,"width":0.015957447,"height":0.03830806},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":22,"bounds":{"left":0.0039893617,"top":0.17238627,"width":0.007978723,"height":0.01915403},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Remote Explorer","depth":19,"bounds":{"left":0.0,"top":0.20111732,"width":0.015957447,"height":0.03830806},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":22,"bounds":{"left":0.0039893617,"top":0.21069433,"width":0.007978723,"height":0.01915403},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Extensions (⇧⌘X) - 2 require update","depth":19,"bounds":{"left":0.0,"top":0.23942538,"width":0.015957447,"height":0.03830806},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":22,"bounds":{"left":0.0039893617,"top":0.2490024,"width":0.007978723,"height":0.01915403},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"2","depth":22,"bounds":{"left":0.009640957,"top":0.2601756,"width":0.0019946808,"height":0.008778931},"on_screen":true,"role_description":"text"},{"role":"AXRadioButton","text":"Testing","depth":19,"bounds":{"left":0.0,"top":0.27773345,"width":0.015957447,"height":0.03830806},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXStaticText","text":"","depth":22,"bounds":{"left":0.0039893617,"top":0.28731045,"width":0.007978723,"height":0.01915403},"on_screen":true,"role_description":"text"},{"role":"AXRadioBu
tton","text":"Claude Code","depth":19,"bounds":{"left":0.0,"top":0.3160415,"width":0.015957447,"height":0.03830806},"on_screen":true,"role_description":"tab","subrole":"AXTabButton","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":false},{"role":"AXHeading","text":"EXPLORER","depth":17,"bounds":{"left":0.022606382,"top":0.047885075,"width":0.018949468,"height":0.02793296},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"EXPLORER","depth":18,"bounds":{"left":0.022606382,"top":0.056664005,"width":0.018949468,"height":0.0103751},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.022606382,"top":0.056664005,"width":0.0023271276,"height":0.0103751}},{"char_start":1,"char_count":7,"bounds":{"left":0.024933511,"top":0.056664005,"width":0.01662234,"height":0.0103751}}],"role_description":"text"},{"role":"AXButton","text":"Explorer Section: app","depth":21,"bounds":{"left":0.015957447,"top":0.07581804,"width":0.09940159,"height":0.017557861},"on_screen":true,"role_description":"button","is_enabled":true,"is_focused":false,"is_selected":false,"is_expanded":true},{"role":"AXStaticText","text":"","depth":23,"bounds":{"left":0.01662234,"top":0.07821229,"width":0.005319149,"height":0.012769354},"on_screen":true,"role_description":"text"},{"role":"AXHeading","text":"Explorer Section: 
app","depth":22,"bounds":{"left":0.022606382,"top":0.07581804,"width":0.0076462766,"height":0.017557861},"on_screen":true,"role_description":"heading"},{"role":"AXStaticText","text":"APP","depth":23,"bounds":{"left":0.022606382,"top":0.079010375,"width":0.0076462766,"height":0.0103751},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.024933511,"top":0.0933759,"width":0.005319149,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"AjReports","depth":27,"bounds":{"left":0.03125,"top":0.0933759,"width":0.019614361,"height":0.011173184},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03125,"top":0.0933759,"width":0.0029920214,"height":0.011971269}},{"char_start":1,"char_count":8,"bounds":{"left":0.034242023,"top":0.0933759,"width":0.01662234,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.024933511,"top":0.110135674,"width":0.005319149,"height":0.012769354},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Audio","depth":27,"bounds":{"left":0.03125,"top":0.110135674,"width":0.011635638,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03125,"top":0.11093376,"width":0.0029920214,"height":0.011971269}},{"char_start":1,"char_count":4,"bounds":{"left":0.034242023,"top":0.11093376,"width":0.008643617,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.024933511,"top":0.12769353,"width":0.005319149,"height":0.012769354},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"AutomatedReports","depth":27,"bounds":{"left":0.03125,"top":0.12769353,"width":0.03756649,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03125,"top":0.12849163,"width":0.0029920214,"height":0.011971269}},{"char
_start":1,"char_count":15,"bounds":{"left":0.034242023,"top":0.12849163,"width":0.034906916,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.024933511,"top":0.1452514,"width":0.005319149,"height":0.012769354},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Calendar","depth":27,"bounds":{"left":0.03125,"top":0.1452514,"width":0.017952127,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03125,"top":0.14604948,"width":0.0029920214,"height":0.011971269}},{"char_start":1,"char_count":7,"bounds":{"left":0.034242023,"top":0.14604948,"width":0.015292553,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.024933511,"top":0.16280925,"width":0.005319149,"height":0.012769354},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Crm","depth":27,"bounds":{"left":0.03125,"top":0.16280925,"width":0.00831117,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.10605053,"top":0.16360734,"width":0.004654255,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.027593086,"top":0.18036711,"width":0.005319149,"height":0.012769354},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Delete","depth":27,"bounds":{"left":0.033909574,"top":0.18036711,"width":0.012965426,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.1811652,"width":0.0033244682,"height":0.011971269}},{"char_start":1,"char_count":5,"bounds":{"left":0.03723404,"top":0.1811652,"width":0.009640957,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.027593086,"top":0.19792499,"width":0.005319149,"height":0.0127
69354},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Hubspot","depth":27,"bounds":{"left":0.033909574,"top":0.19792499,"width":0.017287234,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.19872306,"width":0.0033244682,"height":0.011971269}},{"char_start":1,"char_count":6,"bounds":{"left":0.03723404,"top":0.19872306,"width":0.013962766,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.03025266,"top":0.21548285,"width":0.005319149,"height":0.012769354},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Traits","depth":27,"bounds":{"left":0.03656915,"top":0.21548285,"width":0.011303191,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.21628092,"width":0.0023271276,"height":0.011971269}},{"char_start":1,"char_count":5,"bounds":{"left":0.038896278,"top":0.21628092,"width":0.008976064,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.23144454,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"FetchMergedObjectsPageJob.php","depth":27,"bounds":{"left":0.03656915,"top":0.2330407,"width":0.06881649,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.23383878,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":28,"bounds":{"left":0.039228722,"top":0.23383878,"width":0.06615692,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.2490024,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"HubspotAppUninstallJob.php","depth":27,"bounds":{"left":0.03656915,"top":0.2
5059855,"width":0.059175532,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.25139666,"width":0.0033244682,"height":0.011971269}},{"char_start":1,"char_count":25,"bounds":{"left":0.039893616,"top":0.25139666,"width":0.055851065,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.26656026,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"ImportAccountBatch.php","depth":27,"bounds":{"left":0.03656915,"top":0.26815644,"width":0.050531916,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.26895452,"width":0.0013297872,"height":0.011971269}},{"char_start":1,"char_count":21,"bounds":{"left":0.037898935,"top":0.26895452,"width":0.04920213,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.28411812,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"ImportBatchJobTrait.php","depth":27,"bounds":{"left":0.03656915,"top":0.2857143,"width":0.049867023,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.28651237,"width":0.0013297872,"height":0.011971269}},{"char_start":1,"char_count":22,"bounds":{"left":0.037898935,"top":0.28651237,"width":0.048537236,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.30167598,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"ImportContactBatch.php","depth":27,"bounds":{"left":0.03656915,"top":0.30327216,"width":0.049867023,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.036
56915,"top":0.30407023,"width":0.0013297872,"height":0.011971269}},{"char_start":1,"char_count":21,"bounds":{"left":0.037898935,"top":0.30407023,"width":0.048537236,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.31923383,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"ImportOpportunityBatch.php","depth":27,"bounds":{"left":0.03656915,"top":0.32083002,"width":0.057845745,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.3216281,"width":0.0013297872,"height":0.011971269}},{"char_start":1,"char_count":25,"bounds":{"left":0.037898935,"top":0.3216281,"width":0.056848403,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.3367917,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"ProcessHubspotWebhookEventsTrait.php","depth":27,"bounds":{"left":0.03656915,"top":0.33838788,"width":0.077792555,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.33918595,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":35,"bounds":{"left":0.039228722,"top":0.33918595,"width":0.080784574,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.35434955,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"ProcessInternalWebhookEventsJob.php","depth":27,"bounds":{"left":0.03656915,"top":0.35594574,"width":0.078457445,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.3567438,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":34,"b
ounds":{"left":0.039228722,"top":0.3567438,"width":0.0774601,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.3719074,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"ProcessMergedObjectJob.php","depth":27,"bounds":{"left":0.03656915,"top":0.3735036,"width":0.061170213,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.37430167,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":25,"bounds":{"left":0.039228722,"top":0.37430167,"width":0.058843084,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.38946527,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"ProcessWebhookEventsJob.php","depth":27,"bounds":{"left":0.03656915,"top":0.39106146,"width":0.06482713,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.39185953,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":26,"bounds":{"left":0.039228722,"top":0.39185953,"width":0.06216755,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.02925532,"top":0.40702313,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"UpdateDealWebhookSubscriptionJob.php","depth":27,"bounds":{"left":0.03656915,"top":0.4086193,"width":0.076130316,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.03656915,"top":0.4094174,"width":0.0033244682,"height":0.011971269}},{"char_start":1,"char_count":35,"bounds":{"left":0.039893616,"top":0.4094174,"width":0.08111702,"height":0.011971269}}],"role_description":"text"},{
"role":"AXStaticText","text":"","depth":26,"bounds":{"left":0.027593086,"top":0.42617717,"width":0.005319149,"height":0.012769354},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"Salesforce","depth":27,"bounds":{"left":0.033909574,"top":0.42617717,"width":0.021276595,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.42697525,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":9,"bounds":{"left":0.03656915,"top":0.42697525,"width":0.01861702,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.44213888,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"AutologDelayedToCrm.php","depth":27,"bounds":{"left":0.033909574,"top":0.44373503,"width":0.053856384,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.4445331,"width":0.0029920214,"height":0.011971269}},{"char_start":1,"char_count":22,"bounds":{"left":0.036901597,"top":0.4445331,"width":0.05119681,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.45969674,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"CheckAndRetryRemoteMatch.php","depth":27,"bounds":{"left":0.033909574,"top":0.4612929,"width":0.068484046,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.46209097,"width":0.0029920214,"height":0.011971269}},{"char_start":1,"char_count":27,"bounds":{"left":0.036901597,"top":0.46209097,"width":0.06549202,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.4772546,"width":0.0063164895,"height":0.015163607},"o
n_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"CreateFollowupActivity.php","depth":27,"bounds":{"left":0.033909574,"top":0.47885075,"width":0.054853722,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.47964883,"width":0.0029920214,"height":0.011971269}},{"char_start":1,"char_count":25,"bounds":{"left":0.036901597,"top":0.47964883,"width":0.051861703,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.49481246,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"CreateNotes.php","depth":27,"bounds":{"left":0.033909574,"top":0.4964086,"width":0.034242023,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.49720672,"width":0.0029920214,"height":0.011971269}},{"char_start":1,"char_count":14,"bounds":{"left":0.036901597,"top":0.49720672,"width":0.03125,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.5123703,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"MatchActivitiesToNewOpportunity.php","depth":27,"bounds":{"left":0.033909574,"top":0.5139665,"width":0.07712766,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.51476455,"width":0.0039893617,"height":0.011971269}},{"char_start":1,"char_count":34,"bounds":{"left":0.037898935,"top":0.51476455,"width":0.07347074,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.52992815,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"MatchActivityCrmData.php","depth":27,"bounds"
:{"left":0.033909574,"top":0.53152436,"width":0.054521278,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.5323224,"width":0.0039893617,"height":0.011971269}},{"char_start":1,"char_count":23,"bounds":{"left":0.037898935,"top":0.5323224,"width":0.050531916,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"M","depth":27,"bounds":{"left":0.10638298,"top":0.5323224,"width":0.003656915,"height":0.011173184},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.547486,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"NoteObject.php","depth":27,"bounds":{"left":0.033909574,"top":0.5490822,"width":0.031914894,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.54988027,"width":0.0033244682,"height":0.011971269}},{"char_start":1,"char_count":13,"bounds":{"left":0.03723404,"top":0.54988027,"width":0.028922873,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.56504387,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SaveActivity.php","depth":27,"bounds":{"left":0.033909574,"top":0.5666401,"width":0.03324468,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.5674381,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":15,"bounds":{"left":0.03656915,"top":0.5674381,"width":0.030585106,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.5826017,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SaveTranscrip
tion.php","depth":27,"bounds":{"left":0.033909574,"top":0.58419794,"width":0.04454787,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.584996,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":20,"bounds":{"left":0.03656915,"top":0.584996,"width":0.042220745,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.60015965,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SetupLayout.php","depth":27,"bounds":{"left":0.033909574,"top":0.6017558,"width":0.034242023,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.60255384,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":14,"bounds":{"left":0.03656915,"top":0.60255384,"width":0.031914894,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.6177175,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncActivity.php","depth":27,"bounds":{"left":0.033909574,"top":0.61931366,"width":0.03357713,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.6201117,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":15,"bounds":{"left":0.03656915,"top":0.6201117,"width":0.030917553,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.63527536,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncFieldMetadata.php","depth":27,"bounds":{"left":0.033909574,"top":0.6368715,"width":0.047539894,"height":0.011971269},"on_screen":true,"lines":[{"char_start
":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.63766956,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":20,"bounds":{"left":0.03656915,"top":0.63766956,"width":0.04488032,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.6528332,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncHubspotObjects.php","depth":27,"bounds":{"left":0.033909574,"top":0.6544294,"width":0.051861703,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.6552275,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":21,"bounds":{"left":0.03656915,"top":0.6552275,"width":0.04920213,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.6703911,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncLeads.php","depth":27,"bounds":{"left":0.033909574,"top":0.67198724,"width":0.030917553,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.67278534,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":12,"bounds":{"left":0.03656915,"top":0.67278534,"width":0.02825798,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.68794894,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncObjects.php","depth":27,"bounds":{"left":0.033909574,"top":0.6895451,"width":0.034574468,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.6903432,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":14,"bounds
":{"left":0.03656915,"top":0.6903432,"width":0.031914894,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.7055068,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncOpportunitiesJob.php","depth":27,"bounds":{"left":0.033909574,"top":0.70710295,"width":0.053856384,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.70790106,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":23,"bounds":{"left":0.03656915,"top":0.70790106,"width":0.05119681,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.72306466,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncOpportunity.php","depth":27,"bounds":{"left":0.033909574,"top":0.7246608,"width":0.04288564,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.7254589,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":18,"bounds":{"left":0.03656915,"top":0.7254589,"width":0.040226065,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.7406225,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncProfileMetadata.php","depth":27,"bounds":{"left":0.033909574,"top":0.7422187,"width":0.05086436,"height":0.011971269},"on_screen":true,"lines":[{"char_start":0,"char_count":1,"bounds":{"left":0.033909574,"top":0.7430168,"width":0.0026595744,"height":0.011971269}},{"char_start":1,"char_count":22,"bounds":{"left":0.03656915,"top":0.7430168,"width":0.048204787,"height":0.011971269}}],"role_description":"text"},{"role":"AXStaticText","t
ext":"","depth":27,"bounds":{"left":0.026595745,"top":0.7581804,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncTeamFieldsJob.php","depth":27,"bounds":{"left":0.033909574,"top":0.75977653,"width":0.04886968,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.77573824,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"SyncTeamMetadata.php","depth":27,"bounds":{"left":0.033909574,"top":0.7773344,"width":0.048537236,"height":0.011971269},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"","depth":27,"bounds":{"left":0.026595745,"top":0.7932961,"width":0.0063164895,"height":0.015163607},"on_screen":true,"role_description":"text"},{"role":"AXStaticText","text":"UpdateOpportunitySpecifications.php","depth":27,"bounds":{"left":0.033909574,"top":0.79489225,"width":0.076130316,"height":0.011971269},"on_screen":true,"role_description":"text"}]...
|
-8343544251264563800
|
9185263399333994826
|
visual_change
|
accessibility
|
NULL
|
Explorer (⇧⌘E)
Search (⇧⌘F)
Source Control (⌃⇧G) - 22 pending changes
22
Run and Debug (⇧⌘D)
Remote Explorer
Extensions (⇧⌘X) - 2 require update
2
Testing
Claude Code
EXPLORER
EXPLORER
Explorer Section: app
Explorer Section: app
APP
AjReports
Audio
AutomatedReports
Calendar
Crm
Delete
Hubspot
Traits
FetchMergedObjectsPageJob.php
HubspotAppUninstallJob.php
ImportAccountBatch.php
ImportBatchJobTrait.php
ImportContactBatch.php
ImportOpportunityBatch.php
ProcessHubspotWebhookEventsTrait.php
ProcessInternalWebhookEventsJob.php
ProcessMergedObjectJob.php
ProcessWebhookEventsJob.php
UpdateDealWebhookSubscriptionJob.php
Salesforce
AutologDelayedToCrm.php
CheckAndRetryRemoteMatch.php
CreateFollowupActivity.php
CreateNotes.php
MatchActivitiesToNewOpportunity.php
MatchActivityCrmData.php
M
NoteObject.php
SaveActivity.php
SaveTranscription.php
SetupLayout.php
SyncActivity.php
SyncFieldMetadata.php
SyncHubspotObjects.php
SyncLeads.php
SyncObjects.php
SyncOpportunitiesJob.php
SyncOpportunity.php
SyncProfileMetadata.php
SyncTeamFieldsJob.php
SyncTeamMetadata.php
UpdateOpportunitySpecifications.php...
|
NULL
|
NULL
|
NULL
|
NULL
|