Built by someone who's hard of hearing
I’ve had bilateral sensorineural hearing loss since childhood. I wear hearing aids. And I’ve spent years re-asking questions that were already answered - not because I wasn’t paying attention, but because I genuinely didn’t hear. Samioo came from wanting a clearer picture of when and how often that actually happens.
Audiograms are the gold standard for hearing evaluation, but they measure detection in silence. Real hearing difficulty - the kind that matters - happens in restaurants, in cars, in everyday conversation. That gap is what Samioo is trying to close.
Grounded in research
Samioo’s detection methodology builds on published research on conversational hearing patterns. A 2025 Google Research paper demonstrated that repetition requests (explicit "What?" moments) occur at a base rate of roughly 1 in 1,000 utterances in normal conversation - and that they can be detected automatically with high reliability (an F1 score of about 0.87).
What the paper didn’t address - and what Samioo adds - is semantic re-asking: questions that ask for information already provided earlier in the conversation. These are subtler than a direct "What?" but represent the same underlying phenomenon. Samioo detects both.
Research basis
arXiv:2507.23590 - Conversational hearing comprehension detection via repetition request analysis. Validated on real-world conversation data. Methodology published and peer-reviewed.
How it works technically
Samioo is powered by Echo, an open hearing analysis pipeline. When you submit a recording, it runs through six automated steps:
- Audio ingestion and validation
- Speaker-diarized transcription via AssemblyAI - each utterance tagged by who said it
- Semantic embeddings of every utterance via OpenAI text-embedding-3-small
- Similarity analysis: cosine distance between utterance embeddings, compared over a sliding window of the conversation, to find semantic re-asking
- Pattern matching for explicit repetition requests (Type A)
- PDF report generation with acoustic summary, detection timeline, and confidence scoring
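As a rough illustration of the similarity-analysis step, here is a minimal Python sketch. It is not Echo's actual code: the function names, the `window` and `max_distance` values, the "ends with a question mark" test, and the rule that only a different speaker's earlier utterance counts are all assumptions made for the example. Embeddings are passed in as plain lists of floats, so the sketch runs without calling any API.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Treat an all-zero vector as maximally dissimilar.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def find_semantic_reasks(utterances, window=10, max_distance=0.25):
    """Flag questions that are close in meaning to a recent utterance.

    utterances: list of dicts with 'speaker', 'text' and 'embedding' keys.
    window and max_distance are illustrative values, not tuned thresholds.
    Returns (question_index, earlier_index, distance) tuples.
    """
    hits = []
    for i, current in enumerate(utterances):
        # Assumption: a question is any utterance ending in '?'.
        if not current["text"].rstrip().endswith("?"):
            continue
        # Only look back over the previous `window` utterances.
        for j in range(max(0, i - window), i):
            earlier = utterances[j]
            # Assumption: the information came from someone else.
            if earlier["speaker"] == current["speaker"]:
                continue
            d = cosine_distance(current["embedding"], earlier["embedding"])
            if d <= max_distance:
                hits.append((i, j, d))
    return hits
```

With real data, the `embedding` values would come from the embedding step, and the threshold would need to be chosen against labeled examples rather than guessed.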
The pipeline runs entirely in the cloud. Your audio is processed, your report is generated, and the data is handled according to our privacy policy.
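For the explicit repetition requests (Type A), a rule-based matcher might look something like the sketch below. The phrase list is a small hypothetical sample written for this example, not Samioo's actual rule set, and `is_repetition_request` is an invented name.

```python
import re

# Hypothetical sample of patterns; a real rule set would be larger
# and validated against transcripts.
REPETITION_PATTERNS = [
    r"^(what|huh|sorry|pardon)\s*\?+$",
    r"^(sorry|pardon me|excuse me)\s*,?\s*what\s*\?*$",
    r"\b(say|repeat) (that|it) again\b",
    r"\bwhat did you (just )?say\b",
    r"\b(i )?(didn't|did not|couldn't|could not) (hear|catch) (you|that)\b",
    r"\bcome again\b",
]
_COMPILED = [re.compile(p, re.IGNORECASE) for p in REPETITION_PATTERNS]


def is_repetition_request(text):
    """Return True if the utterance matches any sample pattern."""
    # Normalize curly apostrophes so "didn’t" matches "didn't".
    cleaned = text.strip().replace("\u2019", "'")
    return any(p.search(cleaned) for p in _COMPILED)
```

A matcher like this flags "What?" or "Sorry, what?" but not "What time is the appointment?". It will also misfire on phrases such as "please come again soon", which is one reason rule-based detection has limits.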
Why we’re collecting data
No public dataset of real-world conversations involving hard-of-hearing (HOH) people exists. Every recording submitted to Samioo - with your consent - contributes to the first one.
That dataset is what will eventually allow us to train a classifier that goes beyond rule-based detection: one that can identify hearing comprehension difficulty directly from audio waveforms, without needing a full transcript, and that can distinguish genuine hearing failure from conversational noise.
We're at the beginning. The goal is to build something genuinely useful for audiologists and for the people they treat - and to do it with real data from real people, not synthetic benchmarks.
Want to contribute?
Volunteer participants receive a free personalized hearing analysis report. Your recording directly advances the research.