Building a Clinical Speech-Therapy App With a Real SLP: 4 Lessons From PhoenixSteps
GaltRanch · 2026-05-21 · via DEV Community

Originally published on the AstroLexis blog. Cross-posted here for the community.

My son's speech-language pathologist became my co-founder. PhoenixSteps is what came out of it: a pediatric clinical app that does what existing apps don't because we built it together — engineer plus therapist plus actual patient (also my kid). Here are four lessons from the last six months, including how we taught Apple's Vision framework to do something Apple flatly refused to.

How this started

My son has a speech sound disorder. Specifically, rotacismo (rhotacism): he struggles with /r/ and /rr/, which in Spanish are foundational phonemes that show up in roughly one in every six words. His speech-language pathologist is Stefania (Stefa). We've been seeing her weekly for over a year and the progress has been real, but inconsistent: he'd nail a sound during a session and lose it by mid-week.

The gap was obvious to both of us. He'd do exercises with Stefa for forty minutes, then we'd go home and the exercises mostly stopped, because:

  • The "drill at home" sheet Stefa sent had no feedback loop. My kid would say "ratón" five times and have no idea if any of them were correct.
  • Existing pediatric speech-therapy apps in Spanish are either commercially mediocre (gamified versions of basic flashcards) or clinically rigid (built for adult speech rehab, not children).
  • The market for tools that actually run the clinical exercises a Spanish-speaking SLP would prescribe — with audio feedback, automatic scoring, and progress tracking the therapist can read — basically did not exist for a private practice working with a 4-year-old.

I asked Stefa if she'd want to co-design something. She said yes. That's how PhoenixSteps started — and the four lessons below are the ones I wish I'd known going in.

Lesson 1: A clinical co-creator changes everything about what you ship

I had built consumer iOS apps before. I had not built a clinical tool. The thing I underestimated was how much of the actual product is the protocol, not the software.

Stefa works from named, published clinical protocols — Borrás, Bosch, the AELFA articulation drills. When she prescribes an exercise, she's pulling from a tradition that has decades of consensus on order, dosage, and progression. "Lengua a la nariz" isn't a cute idea — it's Borrás Exercise 29, with specific instructions about duration, repetitions per day, and what to do if the child can't sustain the position.

Before working with Stefa, I would have built a "speech therapy app" that was basically a glorified flashcard deck with cute animations. With Stefa, the exercise catalog became:

  • Orofacial praxias — 7 exercises pulled directly from her clinical sheet, in the order she actually prescribes them.
  • R-group syllable warmups — "ra ra ra," "rrrr-on" — building muscle memory before tackling words.
  • R simple words — rosa, ratón, mira, perro — graded by Stefa for difficulty.
  • R-cluster words (sinfones) — bra, cra, dra, fra, gra, pra, tra. The hard ones.
  • Minimal pairs — R/RR, R/L, D/R, T/D. Auditory discrimination drills.
  • Carrier phrases — embedding the target sound in real sentences.
  • "Tren de la Risa" — a karaoke song Stefa wrote that hits every R context across 8 verses.

None of that comes out of an engineer's imagination. It comes out of a working SLP's notebook.

Lesson 1 distilled: if you're building a clinical product, the clinician is not a "domain advisor." They're a co-founder. Bring them on, give them equity, and give them a real voice on the product roadmap.

Lesson 2: Apple won't give you what your patient needs. Build it yourself.

This is the technical story, and it's the one I'm most proud of.

One of the most prescribed praxias for kids working on /r/ is "lengua a la nariz" — extending the tongue tip toward the nose. The exercise builds the lingual elevation needed for the alveolar trill. Stefa wants the app to automatically verify the kid did the exercise correctly: tongue out, pointed up, sustained for 10 seconds.
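The "sustained for 10 seconds" requirement can be sketched as a small function over timestamped detector samples. This is an illustration in Python rather than the app's own code, and it assumes a detector that emits one direction label per sample. `Sample`, `longest_hold`, `exercise_passed`, and the `max_gap` tolerance for dropped frames are hypothetical names and values, not taken from PhoenixSteps.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Sample:
    t: float         # seconds since the exercise started
    direction: str   # label from the detector, e.g. "up" or "notVisible"


def longest_hold(samples: Iterable[Sample], target: str = "up",
                 max_gap: float = 0.25) -> float:
    """Longest continuous time (seconds) the target direction was held.

    A run ends when another direction is seen, or when two consecutive
    samples are more than max_gap apart (assumed: dropped frames break a hold).
    """
    best = 0.0
    run_start = None
    prev_t = None
    for s in samples:
        gap_broken = prev_t is not None and (s.t - prev_t) > max_gap
        if s.direction == target:
            if run_start is None or gap_broken:
                run_start = s.t          # a new run starts at this sample
            best = max(best, s.t - run_start)
        else:
            run_start = None             # any other label ends the run
        prev_t = s.t
    return best


def exercise_passed(samples: Iterable[Sample],
                    required_seconds: float = 10.0) -> bool:
    """True if "up" was held continuously for at least required_seconds."""
    return longest_hold(samples, target="up") >= required_seconds
```

Working from timestamps instead of counting frames keeps the check independent of the sampling rate, so a slower device changes the resolution of the result but not the rule.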

This sounds like a job for ARKit. Apple has had face tracking with the TrueDepth camera since the iPhone X. ARFaceAnchor.blendShapes includes jawOpen, mouthSmileLeft, cheekPuff — and yes, tongueOut.

Except: tongueOut is a scalar. It's 0 when the tongue is in, and 1 when it's out. Apple does not tell you where the tongue is pointing. Up, down, left, right: they all read the same.

I emailed Apple developer support. The answer was: no, the tongue is not modeled as 3D geometry, and there's no API to detect tongue direction. Tongue tracking is inherently unstable (occlusion by teeth and lips), so Apple chose not to ship something they couldn't validate at Face ID precision.

So Stefa and I built the detector ourselves.

The pipeline

  1. ARKit captures the camera frame on the TrueDepth camera at 60 fps.
  2. We grab the raw frame.capturedImage — the YUV pixel buffer ARKit hands you for free.
  3. Vision detects face landmarks: VNDetectFaceLandmarksRequest returns outerLips, innerLips, and nose as 2D polygons.
  4. We define three regions of interest (ROIs) outside the lip polygon:
    • UP ROI — rectangle between top of upper lip and bottom of nose
    • LEFT ROI — extending leftward from the left corner of the lips
    • RIGHT ROI — same, mirrored
  5. Count pink/red pixels inside each ROI. The lip-skin transition is at Cr ≈ 18; the tongue is at Cr ≈ 25-50. We threshold Cr > 25 to filter out facial skin and pale lips.
  6. If an ROI has more than 400 "tongue-colored" pixels, the tongue is projecting in that direction. We cross-check the result against ARKit's tongueOut blend shape and mirror-compensate for the front-facing camera.
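Steps 4 and 5 can be sketched as follows. This is an illustration in Python, not the app's code, and it rests on three assumptions: the Cr values quoted above are offsets from the 8-bit chroma midpoint of 128, the lip polygon is approximated by its bounding box, and the Cr plane is a plain list of rows. `Rect`, `build_rois`, `count_tongue_pixels`, and `side_width` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List

CR_NEUTRAL = 128        # assumption: the article's Cr values are offsets from 128
TONGUE_CR_OFFSET = 25   # threshold from the article: Cr > 25


@dataclass(frozen=True)
class Rect:
    x0: int  # left edge, inclusive
    y0: int  # top edge, inclusive (image y grows downward)
    x1: int  # right edge, exclusive
    y1: int  # bottom edge, exclusive


def build_rois(lips: Rect, nose_bottom_y: int, side_width: int) -> Dict[str, Rect]:
    """Three ROIs outside the lip bounding box, in image coordinates."""
    return {
        # between the bottom of the nose and the top of the upper lip
        "up": Rect(lips.x0, nose_bottom_y, lips.x1, lips.y0),
        # extending leftward from the left corner of the lips
        "left": Rect(lips.x0 - side_width, lips.y0, lips.x0, lips.y1),
        # the same, mirrored
        "right": Rect(lips.x1, lips.y0, lips.x1 + side_width, lips.y1),
    }


def count_tongue_pixels(cr_plane: List[List[int]], roi: Rect) -> int:
    """Count pixels inside roi whose Cr offset exceeds the threshold.

    The ROI is clamped to the image so a face near the frame edge is safe.
    """
    height = len(cr_plane)
    width = len(cr_plane[0]) if height else 0
    x0, x1 = max(roi.x0, 0), min(roi.x1, width)
    y0, y1 = max(roi.y0, 0), min(roi.y1, height)
    count = 0
    for y in range(y0, y1):
        row = cr_plane[y]
        for x in range(x0, x1):
            if row[x] - CR_NEUTRAL > TONGUE_CR_OFFSET:
                count += 1
    return count
```

On a real frame the per-pixel loop would run over the chroma plane of the captured pixel buffer; the nested lists here only keep the sketch free of dependencies.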

The detector reports up, down, left, right, center, or notVisible at 20 Hz with a confidence score. The first time I showed Stefa the demo (me sticking my tongue toward my nose and watching the screen say "ARRIBA conf 100% pix 3,974"), she didn't believe it was real until I sent her the source code.
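The decision in step 6 can be sketched like this, again in Python as an illustration and not the app's code. The 400-pixel threshold comes from the pipeline above. The `TONGUE_OUT_MIN` cutoff, the `confidence` formula, and the assumption that mirroring swaps left and right are all hypothetical. The sketch omits "down" because the pipeline describes no ROI for it.

```python
from typing import Dict

PIXEL_THRESHOLD = 400   # from the article: more than 400 tongue-colored pixels
TONGUE_OUT_MIN = 0.5    # hypothetical cutoff for ARKit's tongueOut scalar

_MIRROR = {"left": "right", "right": "left"}


def classify(counts: Dict[str, int], tongue_out: float,
             mirrored: bool = True) -> str:
    """Pick a direction from per-ROI pixel counts and the tongueOut value.

    counts maps "up", "left", "right" (image coordinates) to pixel counts.
    """
    if tongue_out < TONGUE_OUT_MIN:
        return "notVisible"              # ARKit does not see the tongue out
    name, pixels = max(counts.items(), key=lambda kv: kv[1])
    if pixels <= PIXEL_THRESHOLD:
        return "center"                  # out, but not projecting into any ROI
    if mirrored:
        # assumption: the front camera image is mirrored, so swap the sides
        name = _MIRROR.get(name, name)
    return name


def confidence(pixels: int) -> float:
    """Hypothetical score that saturates at twice the pixel threshold."""
    return min(1.0, pixels / (2 * PIXEL_THRESHOLD))
```

Requiring both signals to agree means a pink object near the mouth does not count as a tongue unless ARKit also reports the tongue as out.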

Lesson 2 distilled: the most defensible technical work in a clinical product is the part Apple won't ship. If you can do something the platform doesn't expose — and it matters for the clinical outcome — that's your moat.

Lesson 3: Audio quality is a feature, not a detail

PhoenixSteps ships with about 325 pre-recorded voice prompts, all generated using OpenAI's gpt-4o-mini-tts with the "nova" voice. Why pre-recorded TTS instead of letting iOS synthesize on the fly?

  • Pediatric voice consistency. Kids learn faster when the audio prompt sounds the same every time.
  • Speed and articulation. Stefa wanted slower-than-normal pronunciation for warmups, regular pace for practice, a specific cadence for the song. Generating with explicit instructions ("habla en español neutro latinoamericano, ritmo lento y articulado, énfasis infantil sin caricaturizar") gets us the exact register a real SLP would use.
  • Reliability. Pre-recorded audio works offline, doesn't depend on a phone's TTS pipeline being up, doesn't get interrupted by Siri.
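Pre-rendering one prompt might look like the sketch below. The model, voice, and instruction string are the ones named above. The endpoint and JSON field names follow OpenAI's public speech API as I understand it, so check them against the current reference. `build_speech_request` and `render_prompt` are hypothetical helpers, and only the first is exercised without a network call.

```python
import json
import urllib.request

SPEECH_URL = "https://api.openai.com/v1/audio/speech"

# Instruction string quoted in the article.
INSTRUCTIONS = (
    "habla en español neutro latinoamericano, ritmo lento y articulado, "
    "énfasis infantil sin caricaturizar"
)


def build_speech_request(text: str, api_key: str,
                         instructions: str = INSTRUCTIONS) -> urllib.request.Request:
    """Build (but do not send) the HTTP request for one voice prompt."""
    payload = {
        "model": "gpt-4o-mini-tts",
        "voice": "nova",
        "input": text,
        "instructions": instructions,
        "response_format": "mp3",
    }
    return urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def render_prompt(text: str, api_key: str, out_path: str) -> None:
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, api_key)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
```

Keeping the instruction string in one constant is what makes the register consistent: every prompt in a batch is rendered with the same wording.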

We learned the hard way that the OpenAI API will occasionally return a truncated mp3 (we caught three files at 0.36s when they should have been 1.2s). The fix was a post-generation validation step: every newly generated mp3 has to pass a minimum-duration check.
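A minimal sketch of that validation step, in Python as an illustration. It assumes the duration of each file comes from whatever audio probe the build already uses, passed in as `duration_of`. The 0.8-second floor is a hypothetical value, and `find_truncated` and `check_bundle` are hypothetical names.

```python
from typing import Callable, Iterable, List

MIN_SECONDS = 0.8  # hypothetical floor; the article does not state the value


def find_truncated(paths: Iterable[str],
                   duration_of: Callable[[str], float],
                   min_seconds: float = MIN_SECONDS) -> List[str]:
    """Return the files whose measured duration is below the floor."""
    return [p for p in paths if duration_of(p) < min_seconds]


def check_bundle(paths: Iterable[str],
                 duration_of: Callable[[str], float],
                 min_seconds: float = MIN_SECONDS) -> None:
    """Raise if any file looks truncated, so it gets regenerated, not shipped."""
    bad = find_truncated(paths, duration_of, min_seconds)
    if bad:
        raise ValueError("truncated audio, regenerate: " + ", ".join(bad))
```

With the durations from the incident above, a 0.36 s file fails the check and a 1.2 s file passes. A per-prompt expected duration would be stricter than a single floor, at the cost of maintaining that table.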

Lesson 3 distilled: for pediatric/clinical apps, audio is content. Pre-render every prompt with a consistent voice and pace. Validate audio duration before bundling.

Lesson 4: HIPAA-equivalent privacy isn't optional

The users of PhoenixSteps are children. Their voice recordings and progress data are protected health information.

  • Speech recognition on-device (WhisperKit). Voice never leaves the iPhone.
  • Face tracking on-device (ARKit + Vision).
  • Progress data in SwiftData, syncing to the family's private iCloud.
  • No analytics, no third-party SDKs, no Crashlytics, no Facebook Pixel.
  • AI features gated by parental consent. Apple Foundation Models on-device, opt-in.

PhoenixSteps will never have a data breach involving children's voice samples, because there's no centralized data to breach.

Lesson 4 distilled: if you're building anything where the user is a minor or a patient, design as if the audit is happening tomorrow.

Where PhoenixSteps is right now

  • Not in the App Store yet. Build 28. Finishing the clinical pilot with Stefa.
  • Spanish-first. English localization on the roadmap once the clinical content is validated by an English-speaking SLP.
  • Free for parents, with an optional Pro tier for clinicians.
  • Stefa is a co-founder. Equity, not consulting.

If you're an SLP working with pediatric patients in Spanish, write to us at contact@astrolexis.space. We're going to add more clinical advisors as the product matures.


— Bruno Galtranch, founder, AstroLexis LLC. With Stefania, SLP and co-founder.