For speech & language therapists
Home practice and aided family communication — designed to sit alongside your care.
StrokeVoice is a daily companion for adults living with aphasia after stroke. It holds two things together in one place: structured home practice built around techniques you already use, and an aided way for the person to reach out to family and friends on the days speech, reading or writing are hard. It is not therapy and not a medical device. It reports progress honestly, and it treats functional communication as an outcome in its own right.
What it is — and what it isn’t
We have written the product, the marketing and the in-app copy to a strict positioning discipline:
- StrokeVoice is a wellness, practice and communication companion, not a medical device.
- The app guides exercises informed by published research. We do not call this “evidence-based therapy”.
- The first-time setup is a calibration — not a diagnosis or a standardised assessment.
- Users are referred to as users, not patients.
- The conversational AI is a practice partner — never a clinician, never offering medical advice, never grading grammar.
- The Reach Out messaging layer is a consumer communication tool, not a prescribed AAC device. It sits at the wellness end of the aided-communication spectrum and is designed to complement — not replace — any AAC provision a user already has.
This discipline matters: it is what keeps the product compliant with UK and US app-store policies for non-medical wellness software, and it is what keeps us honest with the people who use it.
Reach Out — the aided-communication layer
Alongside the practice side, StrokeVoice includes a two-way messaging surface called Reach Out. The aim is not to replace direct speech; it is to reduce the long gaps of silence that follow aphasia — the unanswered group chats, the missed calls from grandchildren, the messages started and abandoned — and to give the person a calm, low-pressure way back in on the days when speech, reading or writing are hard.
Reach Out offers four equal-weight routes into a message — speak (voice-to-text with an AI-drafted clean-up, raw transcript always retained), pictures (pictogram-to-sentence using the Mulberry Symbols library under CC BY-SA 2.0, ~3,500 symbols bundled with the app), quick phrases (30 defaults across six pragmatic categories, user-extensible), and aided typing (word suggestions seeded from the user’s own prior output, LLM sentence completion, frequency fallback, fuzzy-match tolerance for typos). Every message is shown as a draft the user reviews before sending, so the act of sending is always a deliberate tap.
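As a rough illustration of the aided-typing route, the sketch below ranks suggestions from the user's own prior words first, falls back to a general frequency list, and tolerates typos with a fuzzy match. The function name, parameters and the 0.6 similarity cutoff are assumptions made for this example, not StrokeVoice's actual pipeline, and the LLM sentence-completion step is left out.

```python
from collections import Counter
from difflib import get_close_matches


def suggest_words(prefix, user_history, fallback_vocab, limit=3):
    """Return up to `limit` word suggestions for a partially typed word.

    Hypothetical helper: `user_history` is a list of words the user has
    produced before; `fallback_vocab` is a general word list assumed to be
    sorted most-common first.
    """
    prefix = prefix.lower()
    # Rank the user's own prior words by how often they have used them.
    counts = Counter(word.lower() for word in user_history)
    ranked = [word for word, _ in counts.most_common()]

    # 1. Prefix matches from the user's own output come first.
    suggestions = [w for w in ranked if w.startswith(prefix)]

    # 2. Frequency fallback from the general list.
    suggestions += [w for w in fallback_vocab
                    if w.startswith(prefix) and w not in suggestions]

    # 3. Fuzzy tolerance for typos when too few prefix matches were found.
    if len(suggestions) < limit:
        pool = ranked + [w for w in fallback_vocab if w not in ranked]
        for w in get_close_matches(prefix, pool, n=limit, cutoff=0.6):
            if w not in suggestions:
                suggestions.append(w)
    return suggestions[:limit]
```

With a history of `["garden", "garden", "tea", "grandson"]`, typing `gr` offers "grandson" ahead of general-vocabulary words, and the misspelling `gardn` still surfaces "garden" first.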
The person on the other end does not need the app — messages are delivered via WhatsApp Business and arrive as a normal WhatsApp message. Incoming replies route back into the StrokeVoice conversation thread and are spoken aloud on tap via on-device TTS, acknowledging that reading can be as impaired as speaking. Predictive-reply chips offer aphasia-aware ways back into a conversation without ever auto-sending.
Clinical framing we are careful about:
- Reach Out is not an AAC device in the sense of a prescribed speech-generating device. We do not claim AAC outcomes or bill against AAC funding pathways.
- AI-composed messages carry a visible “sent with StrokeVoice” footer, so recipients are not misled about authorship.
- The Reach Out signal feeds the lived-experience side of the measurement framework (functional conversations logged, CETI-informed carer ratings), never the objective speech signals — trained-item and probe accuracy remain isolated from messaging activity.
- Carer-managed messaging is deliberately not in the MVP. The account holder drives messaging; carers support the practice side.
The techniques the practice draws on today
Three published approaches are running in the product now, implemented in their published form where practical and described honestly where adapted:
- Semantic Feature Analysis (Boyle & Coelho, 1995). Confrontation-naming with a hierarchical cue ladder. The first three cue levels in our ladder map directly to SFA: category → use/function → salient properties. Cue level is recorded per item as highest_cue_level_used, so you can see whether the user is retrieving independently or relying on semantic scaffolding.
- Phonological Component Analysis (Leonard, Rochon & Laird, 2008). The final two cue levels provide phonological support: first sound → full model. Same event record; the level at which the user produced the target is the clinically meaningful signal.
- Script Training (Youmans, Holland & Munoz, 2005; Cherney, 2012). A dedicated Script module ladders through a five-stage practice progression — Listen → Read-along → With-cues → Solo → Generalise — with per-stage events stored for audit. Scripts are personalised from the user’s own situations where captured during onboarding.
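The five-level cue ladder described above could be modelled along these lines. Only the field name highest_cue_level_used comes from the text; the enum, the dataclass, the extra "no cue" level and the callback are assumptions made for the sketch.

```python
from dataclasses import dataclass
from enum import IntEnum


class CueLevel(IntEnum):
    NONE = 0                # assumed: independent retrieval, no cue given
    CATEGORY = 1            # SFA: semantic category
    USE_FUNCTION = 2        # SFA: use / function
    SALIENT_PROPERTIES = 3  # SFA: salient properties
    FIRST_SOUND = 4         # PCA: first sound
    FULL_MODEL = 5          # PCA: full spoken model


@dataclass
class NamingAttempt:
    item_id: str
    highest_cue_level_used: CueLevel
    correct: bool


def run_cue_ladder(item_id, is_correct_after):
    """Climb the ladder one cue at a time until the target is produced.

    `is_correct_after(level)` is a hypothetical callback that presents the
    cue for `level` and returns True if the user then names the item.
    """
    for level in CueLevel:
        if is_correct_after(level):
            return NamingAttempt(item_id, level, True)
    # Every cue was given and the target was still not produced.
    return NamingAttempt(item_id, CueLevel.FULL_MODEL, False)
```

A lower recorded level means more independent retrieval: an item named with no cue is logged at `CueLevel.NONE`, while one that needed the first sound is logged at `CueLevel.FIRST_SOUND`.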
Stimuli are personalised to the user’s own vocabulary, photos and hobbies wherever captured. The practice partner adjusts difficulty using an adaptive layer that targets a 70–85% success band and detects within-session fatigue; the layer, and every cue-level event, is inspectable in a clinician-facing export.
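One simple way to hold a user inside a 70–85% success band is a rolling-window controller like the sketch below. The window size, the five difficulty levels and the class name are assumptions for illustration; within-session fatigue detection is not modelled here.

```python
from collections import deque

TARGET_LOW, TARGET_HIGH = 0.70, 0.85  # success band stated in the text


class DifficultyController:
    """Hypothetical controller: nudges difficulty after each full window."""

    def __init__(self, window=10, level=3, min_level=1, max_level=5):
        self.results = deque(maxlen=window)  # most recent pass/fail outcomes
        self.level = level
        self.min_level = min_level
        self.max_level = max_level

    def record(self, correct):
        """Log one item outcome and return the (possibly updated) level."""
        self.results.append(bool(correct))
        if len(self.results) == self.results.maxlen:
            rate = sum(self.results) / len(self.results)
            if rate > TARGET_HIGH and self.level < self.max_level:
                self.level += 1       # too easy: step difficulty up
                self.results.clear()  # judge the new level on fresh data
            elif rate < TARGET_LOW and self.level > self.min_level:
                self.level -= 1       # too hard: step difficulty down
                self.results.clear()
        return self.level
```

Clearing the window after each change is a design choice in this sketch: it stops results earned at the old difficulty from triggering a second adjustment straight away.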
On the roadmap — not shipped yet
We are upfront about what is not in the product today. The following are planned and scoped, but we do not present them as live until the implementation and (where relevant) SLT partnership are in place:
- Response Elaboration Training (RET). A conversational-AI session type where the partner echoes and elaborates the user’s own production, rather than cueing towards a target. Near-term build.
- VNeST (Verb Network Strengthening Treatment). Requires a verb stimulus set curated with an SLT; we are not claiming VNeST until that set exists.
- Aphasia-calibrated ASR. A phonemic-near-match scoring layer on top of the underlying ASR, so atypical productions are counted as attempted rather than misheard as unrelated words. Planned — today we run stock faster-whisper small.en.
- Aphasia-aware typing dictionary. A paraphasia substitution pass over the typing suggestion pipeline, reviewed quarterly with an SLT.
Techniques we have deliberately scoped out of a self-directed app: Melodic Intonation Therapy (clinically sensitive, benefits from live supervision), Multi-Modal Aphasia Therapy (requires drawing, writing and gesture inputs that would take substantial product work to build), and the intensity profiles of CIAT/CILT (those protocols assume daily clinician-supervised dose). We are not claiming any of these.
Full protocol references are on the evidence page.
The four-layer measurement framework
The progress story is the differentiator. Existing apps tend to answer “is she actually getting better?” with streaks and proprietary scores. We answer it with four parallel layers, deliberately separated by cadence and source:
- Layer 1 — quiet telemetry, every session. Accuracy, response latency, self-corrections, and cue level are captured per item from normal practice. Stored as SessionItemEvent rows; used by the adaptive layer to keep difficulty inside the 70–85% success band.
- Layer 2 — untrained probe (fortnightly, ~3 min). A held-out probe set is kept in a separate pool in the database (ItemPool.PROBE), selected by a different code path from training items. Probe accuracy is gated by a Minimum Detectable Change threshold before any “you’ve improved” framing is shown; the cadence is a scheduled check the user is invited to, not a quiz.
- Layer 3 — connected-speech Voice Journal (monthly, ~3 min). A short connected-speech sample using the same prompts each time. WPM, MLU, TTR and a CIU proxy (Nicholas & Brookshire, 1993) are computed per sample. A 4-week rolling-mean dashboard view is on the near-term roadmap; today each sample is available for clinician audit alongside its audio.
- Layer 4 — check-in (available any time, ~3 min). A CCRSA-informed user confidence rating, an optional CETI-informed carer rating, and a place to log functional wins across twelve everyday categories (family, phone call, café order, shopping, medical, community, hobby, reading, work, celebration, self-care, other).
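For readers who want to see what the Layer 3 numbers mean in practice, here is a minimal sketch of three of them computed from a transcript. It assumes the transcript is already split into utterances, counts MLU in words rather than morphemes, and omits the CIU proxy; the function name and the tokenisation rule are illustrative, not the product's implementation.

```python
import re


def connected_speech_metrics(utterances, duration_seconds):
    """Compute WPM, MLU (in words) and TTR for one speech sample.

    `utterances` is a list of transcript strings, one per utterance.
    """
    # Simple tokeniser: lowercase runs of letters and apostrophes.
    per_utterance = [re.findall(r"[a-z']+", u.lower()) for u in utterances]
    per_utterance = [tokens for tokens in per_utterance if tokens]
    tokens = [tok for utt in per_utterance for tok in utt]
    if not tokens or duration_seconds <= 0:
        return {"wpm": 0.0, "mlu_words": 0.0, "ttr": 0.0}
    return {
        # Words per minute: total words over sample length in minutes.
        "wpm": len(tokens) / (duration_seconds / 60),
        # Mean length of utterance: average words per utterance.
        "mlu_words": len(tokens) / len(per_utterance),
        # Type-token ratio: distinct words over total words.
        "ttr": len(set(tokens)) / len(tokens),
    }
```

For the two utterances "I went to the shop" and "the shop was busy" spoken over 30 seconds, this gives 18 words per minute, an MLU of 4.5 words and a TTR of 7/9.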
Cadence note: probes and Voice Journal samples are intentionally less frequent than weekly — shorter windows give more responsive feedback, longer windows give more statistically stable estimates. We default to the longer window so users aren’t shown noise as progress, but the cadence is configurable in progress/scheduling.py and can be tightened on request. A clinician-facing export exposes per-sample values for audit.
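The text does not say how the Minimum Detectable Change threshold is derived. One widely used convention, assumed here purely for illustration, is MDC95 = 1.96 × √2 × SEM, where SEM = SD × √(1 − r) and r is the test-retest reliability of the probe. The function names and example figures below are hypothetical.

```python
import math


def minimum_detectable_change(baseline_sd, reliability, z=1.96):
    """MDC under the common MDC95 convention (an assumption, see lead-in)."""
    # Standard error of measurement from score spread and reliability.
    sem = baseline_sd * math.sqrt(1.0 - reliability)
    # sqrt(2) accounts for measurement error in both of the compared scores.
    return z * math.sqrt(2.0) * sem


def improvement_message_allowed(previous_score, current_score, mdc):
    """Show "you've improved" framing only if the gain exceeds the MDC."""
    return (current_score - previous_score) > mdc
```

With a baseline SD of 10 percentage points and a reliability of 0.9, the threshold comes out at roughly 8.8 points, so a rise from 60% to 65% would be treated as noise while a rise from 60% to 70% would clear the gate.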
Product invariants — built in, not optional
- Trained and untrained items are kept separate in the database, so practice cannot contaminate a probe.
- No composite “aphasia score.” Trained accuracy, untrained probe accuracy, connected-speech metrics and self/carer-rated signals are reported alongside each other, never collapsed into one number.
- No comparison to other users. The only meaningful comparison is to the user’s own previous self.
- Plateaus are reported honestly. A flat stretch is shown as a flat stretch, not dressed up as progress.
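The first invariant can be pictured as two selection paths over pool-tagged items plus a guard that fails loudly on any overlap. ItemPool.PROBE is named in the text; the TRAINING member, the Item class and the function names are assumptions for this sketch.

```python
from dataclasses import dataclass
from enum import Enum


class ItemPool(Enum):
    TRAINING = "training"  # assumed name for the practice pool
    PROBE = "probe"        # held-out pool named in the text


@dataclass(frozen=True)
class Item:
    item_id: str
    pool: ItemPool


def select_training_items(items):
    """Practice path: only ever sees training-pool items."""
    return [item for item in items if item.pool is ItemPool.TRAINING]


def select_probe_items(items):
    """Probe path: a separate function that only sees probe-pool items."""
    return [item for item in items if item.pool is ItemPool.PROBE]


def assert_pools_disjoint(training, probe):
    """Raise if any item id appears in both selections."""
    overlap = {i.item_id for i in training} & {i.item_id for i in probe}
    if overlap:
        raise ValueError(f"items in both pools: {sorted(overlap)}")
```

Running the guard whenever item sets are edited is one way to keep a practised word from drifting into the probe set unnoticed.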
What StrokeVoice explicitly does not do
- It is not a diagnostic or assessment tool. It does not replace WAB-R, CAT, BNT or any standardised assessment.
- It is not a substitute for direct work with a qualified SLT.
- It does not claim a treatment effect. It tracks change and reports it honestly.
- It does not use voice recordings to train third-party models. Model improvement is a separate, opt-in, revocable consent.
How it fits between your sessions
The literature consistently links improvement to distributed, repeated practice at meaningful dose — the RELEASE meta-analysis (Brady et al., Stroke, 2022, n=959) finds the strongest gains in the 20–50 hour band. Direct contact time is finite; what happens in the days between sessions usually isn’t structured at all. StrokeVoice exists to make that time structured, calm and recordable.
Sessions are short by default (5–10 minutes), repeatable across the week, and adjustable by the carer or user. Where a clinician partnership is in place, target sets and probe items can be reviewed and adjusted.
The Reach Out layer addresses a second gap that practice alone does not fill: the social withdrawal that routinely follows post-stroke aphasia, and its known association with depression, carer strain and reduced quality of life. Functional, low-pressure exchanges with family and friends throughout the week are an outcome in their own right, and are what the Communicative Effectiveness Index (Lomas et al., 1989) was designed to capture. StrokeVoice is the only companion we know of that combines home practice and aphasia-friendly two-way family messaging in the same product — if you are aware of prior art we should look at, we would like to hear from you.
Safety guardrails
- The conversational practice partner is constrained by a published safety prompt: no medical advice, no diagnosis, no medication or prognosis commentary, no grammar correction.
- Risk-flag content (self-harm, acute medical concerns) is routed away from the model and into clear signposting (NHS 111, Stroke Association, Samaritans).
- A clear “this is not therapy” position is shown at onboarding and surfaced inside relevant flows.
- The app does not promise — and never will promise — a specific outcome.
Get involved
We’re actively seeking clinician reviewers, pilot sites and advisory input — particularly from SLTs working in stroke and community aphasia services in the UK. Tell us what we’ve got wrong, what we should add, and what we shouldn’t be claiming.