Features
Everything Talkie can do, explained in detail.
Voice Input
Talkie is built voice-first. The Web Speech API provides browser-native speech-to-text and text-to-speech with no external services or API keys required.
Push-to-talk
Hold the Spacebar or tap the microphone button to start recording. Release to send. This is the fastest way to interact with Talkie when you know exactly what you want to say.
Wake word
Say "Hey Talkie" to activate hands-free recording. The wake word is configurable in settings — change it to any phrase you like. When the wake word is detected, Talkie begins listening and waits for your message.
Continuous listening mode
When enabled, Talkie automatically restarts the microphone after each response. This creates a natural back-and-forth conversation where you never need to press a button. The cycle continues until you cancel or disable continuous mode.
Trigger word
The default trigger word is "over". When Talkie hears the trigger word, it knows your message is complete and sends it to Claude. The trigger word is configurable in settings.
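As a rough illustration of the trigger-word check, the sketch below tests whether a transcript ends with the configured word and strips it from the message. The function name, the result shape, and the single-word assumption are hypothetical and are not taken from Talkie's source.

```typescript
// Hypothetical helper, not Talkie's actual code: detects a trailing trigger
// word and returns the message without it.
interface TriggerResult {
  complete: boolean; // true when the utterance ended with the trigger word
  message: string;   // transcript with the trailing trigger word removed
}

function checkTriggerWord(transcript: string, triggerWord: string = "over"): TriggerResult {
  const trimmed = transcript.trim();
  // Drop trailing punctuation a recognizer may append, then split into words.
  const words = trimmed
    .replace(/[.,!?]+$/, "")
    .split(/\s+/)
    .filter((w) => w.length > 0);

  const last = words.length > 0 ? words[words.length - 1] : "";
  if (words.length === 0 || last.toLowerCase() !== triggerWord.toLowerCase()) {
    return { complete: false, message: trimmed };
  }

  // Remove the trigger word plus any punctuation left dangling before it.
  const message = words.slice(0, -1).join(" ").replace(/[\s.,!?]+$/, "");
  return { complete: true, message };
}
```

Matching only the final word means a sentence such as "Give me an overview" does not end the message, although a sentence that naturally ends in "over" still would.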
Silence detection
After detecting the trigger word (or in push-to-talk mode), Talkie waits for a brief silence before finalizing the transcription. The silence delay is configurable from 0.5 to 3.0 seconds, letting you control how much pause time signals the end of your message.
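The silence window can be sketched as a clamp plus a timestamp comparison. Only the 0.5 to 3.0 second range comes from the text above; the function names and the millisecond timestamps are assumptions made for illustration.

```typescript
// Range taken from the documentation above; everything else is a hypothetical sketch.
const MIN_SILENCE_SECONDS = 0.5;
const MAX_SILENCE_SECONDS = 3.0;

// Keeps a user-supplied delay inside the documented range.
function clampSilenceDelay(seconds: number): number {
  if (!isFinite(seconds)) return MIN_SILENCE_SECONDS;
  return Math.min(MAX_SILENCE_SECONDS, Math.max(MIN_SILENCE_SECONDS, seconds));
}

// Returns true once the gap since the last detected speech reaches the delay.
function shouldFinalize(lastSpeechAtMs: number, nowMs: number, delaySeconds: number): boolean {
  return nowMs - lastSpeechAtMs >= clampSilenceDelay(delaySeconds) * 1000;
}
```

A shorter delay makes Talkie feel more responsive, while a longer one leaves room for pauses in the middle of a thought.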
Text-to-speech
Claude's responses are spoken aloud using the browser's built-in TTS engine. You can select any system voice installed on your device in the settings panel. The available voices differ between macOS, Windows, and Linux.
Sound effects
Talkie plays synthesized audio tones for key events to provide clear feedback without relying on visual cues:
- Start — a rising tone when recording begins
- Stop — a falling tone when recording ends
- Thinking — a subtle pulse while waiting for Claude's response
- Success — a confirmation chime when the response arrives
- Error — a low buzz when something goes wrong
Cassette Tape UI
Every conversation in Talkie is stored as a cassette tape. The entire interface is built around this metaphor, from the input bar to the collection browser.
Tape deck
The bottom input bar is the tape deck. It contains the microphone button, text input field, file attachment button, and send button. This is where you interact with the current tape — recording new messages and viewing playback.
Tape collection
Open the drawer to see all your conversations displayed as illustrated cassette cards. Each card shows the conversation title, message count, and a visual cassette illustration. The collection supports scrolling and filtering.
Visual details
Cassette tapes have rich visual properties:
- 8 label colors — each tape is assigned one of eight label colors for easy identification
- Reel sizes — reel size scales with message count, so longer conversations have visibly larger reels
- Eject animations — tapes animate out when you switch conversations
- 5 sizes — mini, small, medium, large, and xl cassette renderings
- 4 body colors — different cassette body colors for variety
- 5 states — idle, recording, playing, paused, and ejecting, each with distinct visual treatment
Robot Avatar
The robot avatar in the header is an animated character built entirely in CSS. No images or SVGs are used: every part is made from CSS shapes, gradients, and animations.
6 states
The robot transitions between states to reflect what Talkie is doing:
- Idle — relaxed, with gentle eye movement and a slow antenna pulse
- Listening — alert posture, antenna glowing, eyes wide
- Thinking — eyes shift side to side, grille lines pulse, LED blinks
- Speaking — mouth animates, grille pulses with speech rhythm
- Happy — eyes curve into arcs, antenna tip brightens, LED turns green
- Confused — one eye larger than the other, antenna tilts, LED turns amber
Per-state animations
Each state has dedicated CSS animations:
- Eye movement — pupils shift direction based on state
- Antenna glow — the antenna tip changes color and brightness
- Grille pulsing — the speaker grille lines animate during speech and thinking
- LED color — the status LED changes between green, amber, red, and blue
Theme integration
Each theme defines its own robot colors via CSS custom properties (--robot-body, --robot-shadow, --robot-highlight, --screen-idle, etc.). The robot looks completely different in each theme while using the same markup.
Interactive
Hover over the robot to trigger the happy state. This provides a small moment of delight and confirms the avatar is responding to the current theme's color scheme.
Plans System
Talkie can detect and manage structured plans from Claude's responses, turning conversational output into actionable workflows.
Auto-detection
Plans are automatically extracted from Claude's messages when they contain certain patterns:
- Headings followed by numbered steps
- Numbered lists with action items
- Checkbox-style lists (- [ ] and - [x])
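The patterns listed above could be matched with a few regular expressions. The following is a minimal sketch, not Talkie's actual detector: the type names, the function name, and the rule that at least two steps are needed before text counts as a plan are all assumptions.

```typescript
// Hypothetical shapes for a detected plan.
interface DetectedStep { text: string; done: boolean; }
interface DetectedPlan { title: string | null; steps: DetectedStep[]; }

const HEADING = /^#{1,6}\s+(.*)$/;                    // "## Title"
const CHECKBOX = /^\s*[-*]\s+\[( |x|X)\]\s+(.*)$/;    // "- [ ] step" or "- [x] step"
const NUMBERED = /^\s*\d+[.)]\s+(.*)$/;               // "1. step" or "1) step"

function detectPlan(markdown: string): DetectedPlan | null {
  let title: string | null = null;
  const steps: DetectedStep[] = [];

  for (const line of markdown.split(/\r?\n/)) {
    const heading = HEADING.exec(line);
    if (heading && steps.length === 0) {
      // The last heading seen before the first step becomes the plan title.
      title = (heading[1] ?? "").trim();
      continue;
    }
    const checkbox = CHECKBOX.exec(line);
    if (checkbox) {
      steps.push({ text: (checkbox[2] ?? "").trim(), done: (checkbox[1] ?? "").toLowerCase() === "x" });
      continue;
    }
    const numbered = NUMBERED.exec(line);
    if (numbered) {
      steps.push({ text: (numbered[1] ?? "").trim(), done: false });
    }
  }

  // Assumption: a single list item is not enough to call something a plan.
  return steps.length >= 2 ? { title, steps } : null;
}
```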
Side panel
The plans panel slides open from the right side of the screen. It shows a list of all detected plans and a detail view for the selected plan. Plans can be viewed, edited, and managed without leaving the conversation.
Status workflow
Each plan follows a lifecycle:
- Draft — newly detected, not yet reviewed
- Approved — reviewed and accepted
- In Progress — actively being worked on
- Completed — all steps finished
- Archived — stored for reference, hidden from active views
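One way to model this lifecycle is a union type with a table of allowed transitions. The status names mirror the list above, but the transition rules themselves (for example, that any plan can be archived directly) are assumptions, since the documentation only gives the order of the stages.

```typescript
// Status names follow the documented lifecycle; identifiers are hypothetical.
type PlanStatus = "draft" | "approved" | "in_progress" | "completed" | "archived";

// Assumed transition table: each status may advance one step or be archived.
const NEXT_STATUSES: Record<PlanStatus, PlanStatus[]> = {
  draft: ["approved", "archived"],
  approved: ["in_progress", "archived"],
  in_progress: ["completed", "archived"],
  completed: ["archived"],
  archived: [],
};

function canTransition(from: PlanStatus, to: PlanStatus): boolean {
  return NEXT_STATUSES[from].indexOf(to) !== -1;
}

// Archived plans are hidden from active views, per the list above.
function isVisibleInActiveViews(planStatus: PlanStatus): boolean {
  return planStatus !== "archived";
}
```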
Management
Plans can be edited, deleted, or linked to specific conversations. Linking a plan to a conversation creates a cross-reference so you can always trace a plan back to the discussion that created it.
Liner Notes
Every cassette tape can have liner notes — a per-conversation notes panel where you can add your own annotations, summaries, or reference material.
Markdown support
Liner notes support common Markdown formatting:
- Headings (
#,##,###) - Bold and italic text
- Inline
codeand code blocks - Bulleted and numbered lists
Pin from chat
You can pin message content directly from the chat timeline into the liner notes. This lets you quickly save important snippets — a command, a URL, a key decision — without manually copying and pasting.
Search
Talkie provides fast, full-text search across all your conversations using SQLite FTS5.
Global search
Press Cmd + K (or Ctrl + K on Windows/Linux) to open the search overlay. It appears instantly and focuses the input field so you can start typing immediately.
FTS5 full-text search
Search is powered by SQLite's FTS5 extension, which provides fast ranked full-text search across all stored messages. Results are ranked by relevance and returned with highlighted matching snippets.
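A ranked FTS5 query with highlighted snippets might look like the sketch below. The table name `messages_fts`, the column index, the `<mark>` tags, and the result limit are hypothetical; `snippet()`, `MATCH`, and `ORDER BY rank` are standard FTS5 features. The helper quotes each search term so that user input is treated as plain text rather than FTS5 query syntax.

```typescript
// Turns free-form input into an FTS5 MATCH expression. Each term becomes a
// quoted string (embedded quotes are doubled), and FTS5 implicitly ANDs them.
function toFtsQuery(input: string): string {
  return input
    .split(/\s+/)
    .filter((term) => term.length > 0)
    .map((term) => '"' + term.replace(/"/g, '""') + '"')
    .join(" ");
}

// Hypothetical schema: an FTS5 table named messages_fts whose first column
// (index 0) holds the message text. Bind the output of toFtsQuery to the "?".
const SEARCH_SQL = `
SELECT rowid,
       snippet(messages_fts, 0, '<mark>', '</mark>', '...', 12) AS excerpt
FROM messages_fts
WHERE messages_fts MATCH ?
ORDER BY rank
LIMIT 20`;
```

Ordering by `rank` returns the most relevant rows first, because FTS5 assigns better matches numerically lower rank values.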
Keyboard navigation
Navigate search results without touching the mouse:
- Arrow Up / Arrow Down — move between results
- Enter — open the selected result
- Escape — close the search overlay
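The key handling above can be expressed as a small pure function over the overlay state. This is a sketch under assumptions: the state shape is hypothetical, the selection is clamped rather than wrapped at the ends of the list, and Enter is assumed to close the overlay after opening a result.

```typescript
// Hypothetical overlay state for illustration.
interface SearchNavState {
  selected: number;            // index of the highlighted result
  open: boolean;               // whether the overlay is visible
  openedResult: number | null; // index of the result the user opened, if any
}

function handleSearchKey(state: SearchNavState, key: string, resultCount: number): SearchNavState {
  switch (key) {
    case "ArrowDown":
      // Clamp at the last result instead of wrapping around.
      return { ...state, selected: Math.min(state.selected + 1, Math.max(resultCount - 1, 0)) };
    case "ArrowUp":
      return { ...state, selected: Math.max(state.selected - 1, 0) };
    case "Enter":
      return resultCount > 0 ? { ...state, openedResult: state.selected, open: false } : state;
    case "Escape":
      return { ...state, open: false };
    default:
      return state; // any other key leaves the navigation state untouched
  }
}
```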
Tape collection filtering
The tape collection drawer also supports inline filtering. Type in the filter field to narrow the displayed tapes by title or content, providing a quick way to find a specific conversation without opening the full search overlay.
Context Linking
Context linking lets you give Claude memory across separate conversations by selecting past tapes as context for the current one.
How it works
Toggle past conversations on or off as context sources. When context is enabled, the selected conversations are merged chronologically and included in the prompt sent to Claude. This means Claude can reference information from previous tapes without you needing to repeat it.
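The chronological merge can be sketched as a filter followed by a sort on message timestamps. The `Tape` and `TapeMessage` shapes and the ISO 8601 `createdAt` field are assumptions for illustration, not Talkie's actual data model.

```typescript
// Hypothetical data shapes for illustration.
interface TapeMessage {
  role: "user" | "assistant";
  content: string;
  createdAt: string; // ISO 8601 timestamp
}
interface Tape {
  id: string;
  title: string;
  messages: TapeMessage[];
}

// Collects messages from the tapes toggled on as context and orders them by
// time, oldest first, so they can be prepended to the prompt.
function buildContext(tapes: Tape[], enabledIds: string[]): TapeMessage[] {
  const merged: TapeMessage[] = [];
  for (const tape of tapes) {
    if (enabledIds.indexOf(tape.id) !== -1) {
      for (const message of tape.messages) merged.push(message);
    }
  }
  return merged.sort((a, b) => Date.parse(a.createdAt) - Date.parse(b.createdAt));
}
```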
Use cases
- Continue a thread from a previous conversation
- Reference decisions made in earlier sessions
- Build on code or plans from past discussions
- Provide background context for new questions
Image Handling
Talkie supports images throughout the conversation, from input to analysis to browsing.
Input methods
- Drag and drop — drag an image file directly onto the chat area
- File picker — click the attach button in the tape deck to browse for files
Auto-analysis
When an image is attached, Talkie automatically sends it to Claude for analysis using the vision capability (Sonnet 4). Claude describes what it sees and can answer questions about the image content.
Storage
Images are stored as base64 data URLs in the database. Each image has an editable description field that you can modify to add your own notes or correct Claude's auto-generated description.
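For reference, a base64 data URL has the form `data:<mime type>;base64,<payload>`. The sketch below splits such a URL into its parts and computes the decoded size, which is useful because base64 inflates the stored size by roughly a third. The function names are hypothetical, and the parser deliberately handles only the plain `;base64` form.

```typescript
interface ParsedDataUrl {
  mimeType: string; // e.g. "image/png"
  base64: string;   // the encoded payload
}

// Parses "data:<mime>;base64,<payload>". Returns null for anything else.
function parseDataUrl(url: string): ParsedDataUrl | null {
  const match = /^data:([\w.+-]+\/[\w.+-]+);base64,([A-Za-z0-9+\/]*={0,2})$/.exec(url);
  return match ? { mimeType: match[1] ?? "", base64: match[2] ?? "" } : null;
}

// Size in bytes of the decoded payload, assuming standard padded base64.
function decodedByteSize(base64: string): number {
  const padding = base64.slice(-2) === "==" ? 2 : base64.slice(-1) === "=" ? 1 : 0;
  return Math.floor((base64.length * 3) / 4) - padding;
}
```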
Media library
The media library provides a browsable view of all images across your conversations. You can search, filter, and open any image in a full-screen lightbox for detailed viewing.
Lightbox
Click any image in the chat timeline or media library to open it in a full-screen lightbox. The lightbox supports zooming and panning for detailed inspection.
Background Jobs
Background jobs let you run long-running tasks on the server without blocking the UI.
How it works
Jobs are created via the REST API or MCP tools. Each job runs asynchronously on the server and streams progress updates via Server-Sent Events (SSE). The frontend subscribes to that stream, polls periodically as a fallback, and displays live progress in the job status bar.
Creating jobs
You can create background jobs through:
- REST API — POST /api/jobs with a prompt and optional conversation ID
- MCP tools — use the create_talkie_job tool from Claude Code
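A request to the REST endpoint might be assembled as shown below. Only the `POST /api/jobs` route and the idea of a prompt plus an optional conversation ID come from the list above; the JSON field names (`prompt`, `conversationId`), the content type, and the base URL in the usage comment are assumptions.

```typescript
// Hypothetical request body; the real field names may differ.
interface CreateJobBody {
  prompt: string;
  conversationId?: string;
}

interface JobRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildCreateJobRequest(baseUrl: string, prompt: string, conversationId?: string): JobRequest {
  const payload: CreateJobBody = { prompt };
  if (conversationId !== undefined) payload.conversationId = conversationId;
  return {
    url: baseUrl.replace(/\/+$/, "") + "/api/jobs", // avoid a double slash
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}

// Usage (the base URL is a placeholder):
//   const req = buildCreateJobRequest("http://localhost:3000", "Summarize this tape");
//   await fetch(req.url, { method: req.method, headers: req.headers, body: req.body });
```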
SSE streaming
Job events are streamed in real-time using Server-Sent Events. The frontend subscribes to the job's SSE endpoint and updates the UI as events arrive — showing progress, partial results, and completion status.
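In a browser, the built-in `EventSource` class handles SSE connections. To show what travels over the wire, the sketch below parses the SSE text format by hand: events are separated by blank lines, `event:` names the event type, `data:` carries the payload, and lines starting with a colon are comments. It assumes the input contains only complete events, and the event names used in any example are hypothetical, since the documentation does not list the event types Talkie emits.

```typescript
interface SseMessage {
  event: string; // event type, "message" when no event: field is present
  data: string;  // payload, with multiple data: lines joined by newlines
}

function parseSseChunk(chunk: string): SseMessage[] {
  const messages: SseMessage[] = [];
  for (const block of chunk.split(/\r?\n\r?\n/)) {
    let eventName = "message";
    const dataLines: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line === "" || line.charAt(0) === ":") continue; // blank line or comment
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.charAt(0) === " ") value = value.slice(1); // one optional leading space
      if (field === "event") eventName = value;
      else if (field === "data") dataLines.push(value);
    }
    // Blocks without any data lines do not produce an event.
    if (dataLines.length > 0) messages.push({ event: eventName, data: dataLines.join("\n") });
  }
  return messages;
}
```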
Job status bar
Active jobs appear in a status bar at the top of the screen. Each job shows its current state and a summary of progress. Click a job to see its full detail panel.
Auto-polling
The frontend polls for job updates every 3 seconds to ensure the UI stays current even if SSE connections are interrupted.
Statuses
Jobs move through these states:
- Queued — waiting to start
- Running — actively executing
- Completed — finished successfully with a result
- Failed — encountered an error
- Cancelled — stopped by the user
Export
Export any conversation for backup, sharing, or archival purposes.
Markdown export
Press Cmd + E (or Ctrl + E) to export the current conversation as a Markdown file. The export includes all messages formatted with proper headings, code blocks, and metadata like timestamps and roles.
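A Markdown export along these lines could be produced by a function like the following. The exact layout Talkie writes is not documented here, so the heading structure, the speaker labels, and the `ExportMessage` shape are assumptions.

```typescript
// Hypothetical message shape for illustration.
interface ExportMessage {
  role: "user" | "assistant";
  content: string;   // message body, which may already contain Markdown
  timestamp: string; // e.g. an ISO 8601 string
}

// Renders a conversation as Markdown: a title heading, then one section per
// message whose heading carries the role and timestamp.
function exportToMarkdown(title: string, messages: ExportMessage[]): string {
  const lines: string[] = ["# " + title, ""];
  for (const message of messages) {
    const speaker = message.role === "user" ? "You" : "Claude";
    lines.push("## " + speaker + " (" + message.timestamp + ")", "", message.content, "");
  }
  return lines.join("\n");
}
```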
JSON export
For programmatic use, export conversations as JSON. The JSON format includes the complete conversation data: messages, liner notes, metadata, creation dates, and all associated properties.
Keyboard Shortcuts
Talkie supports keyboard shortcuts for common actions. Press ? to toggle the shortcuts guide overlay.
| Shortcut | Action |
|---|---|
| Spacebar (hold) | Push-to-talk — hold to record, release to send |
| Escape | Cancel recording |
| Cmd + K (Ctrl + K on Windows/Linux) | Open global search |
| Cmd + E (Ctrl + E on Windows/Linux) | Export current conversation |
| ? | Toggle keyboard shortcuts guide |
Mobile
Talkie works on mobile devices with a touch-optimized interface built around the Floating Action Button (FAB).
Floating Action Button
On mobile, the primary recording control is a floating action button that hovers over the interface. It provides quick access to voice recording from anywhere in the app.
Draggable and resizable
The FAB can be repositioned by long-pressing and dragging it to any corner of the screen. It can also be resized to suit your preference. This flexibility ensures it never blocks important content.
Interaction
- Tap — start or stop voice recording
- Long-press — enter drag mode to reposition the button
Persistence
The FAB's position is persisted in localStorage, so it stays where you put it across page reloads and sessions.
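Persisting a corner position can be sketched as follows. The storage key, the default corner, and the nearest-corner snapping rule are assumptions. The store is passed in as a small interface that the browser's `localStorage` object satisfies, which also keeps the logic testable outside a browser.

```typescript
// Minimal subset of the Web Storage interface; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

type FabCorner = "top-left" | "top-right" | "bottom-left" | "bottom-right";

const FAB_CORNERS: FabCorner[] = ["top-left", "top-right", "bottom-left", "bottom-right"];
const FAB_STORAGE_KEY = "talkie.fabCorner";           // hypothetical key name
const DEFAULT_FAB_CORNER: FabCorner = "bottom-right"; // assumed default

// Picks the screen corner closest to where the drag ended.
function nearestCorner(x: number, y: number, viewportWidth: number, viewportHeight: number): FabCorner {
  const left = x < viewportWidth / 2;
  if (y < viewportHeight / 2) return left ? "top-left" : "top-right";
  return left ? "bottom-left" : "bottom-right";
}

function saveFabCorner(store: KeyValueStore, corner: FabCorner): void {
  store.setItem(FAB_STORAGE_KEY, corner);
}

// Falls back to the default when nothing valid has been stored yet.
function loadFabCorner(store: KeyValueStore): FabCorner {
  const raw = store.getItem(FAB_STORAGE_KEY);
  for (const corner of FAB_CORNERS) {
    if (corner === raw) return corner;
  }
  return DEFAULT_FAB_CORNER;
}
```

In the browser this would be called as `saveFabCorner(localStorage, nearestCorner(x, y, innerWidth, innerHeight))` when a drag ends, and `loadFabCorner(localStorage)` on startup.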