---
title: "Algorithmic Dysfluency: Why AI Cannot Hear the Stammering Subject"
author: "Rantideb Howlader"
date: "2026-02-05T00:00:00.000Z"
canonical_url: "https://www.ranti.dev/blog/algorithmic-dysfluency"
license: "CC-BY-4.0"
---


## I. Introduction: The Error Rate of Existence

"To stammer is to interrupt the flow of capital." - Investigational Thesis

When I speak to my devices, the system subjects me to a severe temporal discipline. It listens for about 1.5 seconds, and if I do not produce a phoneme within that window, it assumes the request is finished and unceremoniously terminates the session. It cannot conceive of a speaker who requires time to think.

I stammer. My speech is characterized by **Temporal Gaps** - silences, repetitions, and blocks that defy the standard wait-time heuristics of Voice User Interfaces (VUIs). To Siri, Alexa, and GPT-4o, my silence is empty, interpreted as a `null` value in the audio stream. To me, that silence is profoundly full of meaning: it is the sound of my cognitive labor, the friction of my thought process, and the audible artifact of my survival in a hostile temporal architecture.

This disjuncture - between my **Biological Clock** and the **Algorithmic Clock** - is not a mere technical bug to be patched but a form of **Epistemological Violence**: the disregard for, and the silencing of, the knowledge systems of marginalized groups. The Artificial Intelligence industry is building what **Neil Postman** would call a **Technopoly of Fluency**, enforcing a medicalized hegemony of smoothness that renders disabled bodies legally and technologically illegible.

In this critique, **I argue** that the Stammer is not a biological defect to be cured, but a **Temporal Resistance** - a "Glitch" in the capitalist production of time that reveals the hegemony of the normative clock. **I posit** that until we build systems that can listen to the silence, we are building systems that only listen to power.

### 1.1 The Technical Audit: Why Whisper Fails

To move beyond auto-ethnographic anecdote, let us examine the underlying architecture of these exclusion mechanisms. State-of-the-art ASR models (like **OpenAI Whisper**) are trained on massive corpora of audio paired with transcripts (e.g., audiobooks, podcasts, captioned web video), and those transcripts have typically been sanitized of human imperfection. These datasets are fundamentally **Discriminatory**:

1.  **Selection Bias:** Podcasts are edited to remove "ums," "ahs," and stammers, representing a Super-Fluency that exists nowhere in natural human biology.
2.  **The Tokenizer:** Whisper's decoder, like GPT-style language models, operates on subword tokens produced by **Byte-Pair Encoding (BPE)** and is trained to predict the most probable next token. A stammer (e.g., "wh-wh-what") fragments into a rare token sequence that the decoder assigns low probability, so the model is effectively punished for transcribing it.
3.  **The Context Window:** When Whisper encounters a block (silence), it often hallucinates a completed sentence. Trained on fluent transcripts and evaluated by **Word Error Rate (WER)**, it prefers to invent a coherent lie rather than transcribe a broken truth.
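The tokenizer point can be made concrete with a toy example. The sketch below is not Whisper's tokenizer: it uses a small hypothetical vocabulary (`VOCAB`) and greedy longest-match splitting as a stand-in for learned BPE merge rules. It only illustrates how a stammered word breaks into more, and rarer, pieces than its fluent form.

```python
# Toy subword tokenizer: greedy longest-match against a fixed vocabulary.
# VOCAB is hypothetical and hand-picked for illustration; real BPE
# vocabularies are learned from data and applied via ordered merge rules.
VOCAB = {"what", "wh", "w", "h", "a", "t", "-"}


def tokenize(text: str) -> list[str]:
    """Split text into the longest vocabulary pieces, left to right."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then fall back to shorter ones.
        for j in range(len(text), i, -1):
            if text[i:j] in VOCAB:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Character not in the vocabulary: emit it as its own token.
            tokens.append(text[i])
            i += 1
    return tokens


fluent = tokenize("what")
stammered = tokenize("wh-wh-what")

print(fluent)     # ['what']
print(stammered)  # ['wh', '-', 'wh', '-', 'what']
```

In this toy setting the fluent word is a single token while the stammered form is five. A decoder trained mostly on fluent transcripts has seen the first pattern constantly and the second almost never.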

This is **Algorithmic Dysmorphia**. The machine smooths out my stutter in the transcript to fix me, but in doing so, it erases the texture of my voice. As discussed in my previous work on [Glitch Poetics](/blog/glitch-poetics-disability), this erasure is a form of digital eugenics.

## II. The Theory: Stammering as Temporal Refusal

We must ground this technical failure in **Critical Theory** and **Phenomenology** by asking why the machine hates the pause. As **Franco "Bifo" Berardi** argues, modern capitalism is **Semio-Capitalism** - the production of value through the high-speed exchange of signs (info-labor). In this economy, **Time is Liquidity**. Attention is the scarcest commodity, and any hesitation, any **Latency**, is simply friction that lowers the rate of profit. **Marshall McLuhan** famously declared that "the medium is the message." In the age of AI, the medium is **Velocity**, and its message is that "Intelligence equals Speed." The stammerer, by breaking the velocity, breaks the message itself.

### 2.1 The Stammer as a "Glitch"

In my analysis of [The Physics of Trauma](/blog/trauma-time-physics), I established that trauma creates a distinct inertial reference frame, and building upon that, I argue here that the **Stammer is a Temporal Glitch**, signifying that the stammering subject refuses to synchronize with the Master Clock of neoliberalism.

- **The Fluent Speaker:** Delivers information in linear time ($t \rightarrow t+1$).
- **The Stammerer:** Delivers information in recursive time ($t \rightarrow t \rightarrow t \rightarrow t+1$).

Society pathologizes this recursion as anxiety or stupidity, but **I propose** we reframe it as **Resistance**. When I block on a word, I force the listener (and the machine) to **Wait**, thereby reclaiming the present moment from the relentless rush of the future. In a world of Instant Answers (Perplexity, ChatGPT), the ability to pause - to not answer instantly - is a radical political act of **Hermeneutics**.

### 2.2 The Orphan's Code (Auto-Ethnography)

My life has been defined by two great interruptions, which constitute the genealogy of my resistance. The first was the loss of my mother when I was six years old. I became an orphan in a world that did not care to explain itself to me, forced to earn my bread and butter before I could read. In that state of high-stakes survival, I learned that **Speed Kills** - if you answer too fast, you agree to things you simply do not understand.

We must reframe this not as failure, but as what **Donna Haraway** might call a **Cyborg Resistance**. Haraway taught us that the cyborg is "oppositional, utopian, and completely without innocence," and the stammering subject is the ultimate cyborg - half biological struggle, half technical interface. The second interruption was the accident that took my fluency and forced me to live in **Crip Time** (Kafer). While I work to regain my speech, I have found that the machines I work with (Siri, Whisper) have no patience for my recovery. They demand I be "fixed" now, enforcing what **Frantz Fanon** called a Zone of Non-Being upon the dysfluent. To the machine, I am not a user, but an unhandled exception.

## III. The Architecture of Erasure

If the stammer is a form of resistance, the current AI stack is a machine of containment and erasure. The industry's failure to accommodate dysfluency is not merely an oversight in Edge Case testing; it is a fundamental architectural decision to prioritize **Order** over **Integrity** and **Teleology**.

### 3.1 The Clean Data Fallacy

Deep Learning models are only as good as their ground truth, yet the industry relies on datasets like Common Voice or LibriSpeech, which consist largely of read, scripted speech and are therefore nearly free of disfluencies. Engineers treat stammers, repetitions, and pauses as **Noise** to be filtered out before training, assuming that the purified signal contains the truth.

This is a **Category Error**. For the stammering subject, the noise is the signal. By training models to predict the next fluent token, we are effectively training them to hallucinate a normative speaker where none exists. We are building systems that prefer a coherent lie to a broken truth, echoing the argument I made in [Theology of Code](/blog/theology-of-code).
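To show what "filtering noise" can look like in practice, here is a minimal sketch of a transcript-cleaning step. The filler list and the regular expression are assumptions made for illustration, not the preprocessing code of any particular dataset, and the part-word rule is deliberately crude (it would also mangle ordinary hyphenated words).

```python
import re

# Hypothetical filler list; real cleaning pipelines vary.
FILLERS = {"um", "uh", "er", "ah"}


def clean_transcript(verbatim: str) -> str:
    """Mimic a 'clean data' pass that removes disfluencies from a transcript."""
    # Collapse part-word repetitions such as "wh-wh-what" into "what".
    text = re.sub(r"\b(?:\w+-)+(\w+)", r"\1", verbatim)
    kept = []
    for word in text.split():
        bare = word.strip(",.").lower()
        if bare in FILLERS:
            continue  # drop filled pauses
        if kept and kept[-1].strip(",.").lower() == bare:
            continue  # drop immediate whole-word repetitions
        kept.append(word)
    return " ".join(kept)


verbatim = "um I I want to know wh-wh-what time it is"
cleaned = clean_transcript(verbatim)

print(cleaned)  # I want to know what time it is
```

Every token the function discards is exactly the material a dysfluent speaker produced. A model trained only on the `cleaned` side never sees that the `verbatim` side existed.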

### 3.2 The Tyranny of the Timeout

Consider the End-of-Speech (EOS) detection logic in modern Voice User Interfaces (VUIs), where the standard threshold is often set between 700 ms and 1500 ms. This is an arbitrary temporal border that enforces a Normative Pace of thought upon the entire user base.

When the stammerer hits a block, they are not done; they are actively working to produce speech. But the machine interprets silence as absence. By hard-coding these timeout thresholds, developers have inadvertently encoded a Time-Out mechanism that deports the dysfluent user from the digital public square every time they pause to breathe.
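The timeout logic described here can be sketched in a few lines. This is a simplified model, not any vendor's endpointer: it assumes the audio has already been reduced to per-frame speech/silence flags, and the frame size, the example utterance, and the function name `cutoff_ms` are all hypothetical.

```python
FRAME_MS = 100  # assumed frame length for this illustration


def cutoff_ms(frames, frame_ms, timeout_ms):
    """Return the time (ms) at which a fixed-silence endpointer ends the turn.

    `frames` is a sequence of booleans: True for speech, False for silence.
    Returns None if the silence never reaches `timeout_ms` after speech starts.
    """
    silent_ms = 0
    heard_speech = False
    for index, is_speech in enumerate(frames):
        if is_speech:
            heard_speech = True
            silent_ms = 0  # any speech resets the silence counter
        elif heard_speech:
            silent_ms += frame_ms
            if silent_ms >= timeout_ms:
                return (index + 1) * frame_ms
    return None


# 0.5 s of speech, a 3 s block, then 0.8 s of speech that finishes the request.
utterance = [True] * 5 + [False] * 30 + [True] * 8

print(cutoff_ms(utterance, FRAME_MS, 700))   # 1200
print(cutoff_ms(utterance, FRAME_MS, 1500))  # 2000
print(cutoff_ms(utterance, FRAME_MS, 5000))  # None
```

With either common threshold the session closes during the block, long before the final 0.8 seconds of speech. Only a much longer, or user-configurable, timeout lets the utterance complete.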

### 3.3 Towards Integrity Retention

The goal of ASR is currently defined as minimizing the **Word Error Rate (WER)**, a metric that assumes that the perfect transcript is one that matches a fluent script.

A more humane technology would optimize for **Integrity Retention**. It would ask: Does the transcript capture the texture of the speech? If a user fights through a 10-second block to say "I... I... I love you," a "fluent" transcript that reads "I love you" has destroyed the emotional labor of the utterance. A truly smart AI would learn to listen to the struggle, not just the syntax.
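The difference between the two objectives can be worked through numerically. WER below is the standard word-level edit distance divided by the reference length. The `integrity_retention` function is only one possible reading of the proposed metric, offered as an assumption: it scores a transcript against a verbatim reference instead of a fluent one.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: edit distance divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


def integrity_retention(verbatim_reference: str, hypothesis: str) -> float:
    """Hypothetical metric: 1 - WER against a verbatim (disfluent) reference."""
    return max(0.0, 1.0 - wer(verbatim_reference, hypothesis))


fluent_script = "i love you"
spoken = "i i i love you"  # what was actually said

# Scored against the fluent script, the faithful transcript is penalized.
print(wer(fluent_script, spoken))         # 2 insertions / 3 words
print(wer(fluent_script, fluent_script))  # 0.0

# Scored against what was actually said, the ranking flips.
print(integrity_retention(spoken, spoken))         # 1.0
print(integrity_retention(spoken, fluent_script))  # 1 - 2/5
```

Under WER with a fluent reference, the smoothed transcript is perfect and the faithful one has an error rate of about 0.67. Changing only the reference reverses which transcript counts as correct, which is the sense in which the choice of ground truth is an architectural decision.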

## IV. Conclusion: The Right to Lag

**Édouard Glissant** demanded the Right to Opacity - the right not to be understood or transparent to the colonial gaze.

I demand the **Right to Lag**.

The right to process information at my own speed, without penalty.

The right to be out of sync with the centralized server of capitalist production.

As AI becomes the Universal Interface for banking, healthcare, and law, we face an imminent civil rights crisis. If these systems cannot hear the stammering subject, then the stammering subject will be locked out of the digital polity entirely.

We are building a world where only the fluent are citizens, and the dysfluent are relegated to the margins.

My research is a refusal of that world.

I am here to glitch the timeline.

### Bibliography

- **Berardi, Franco "Bifo".** The Soul at Work: From Alienation to Autonomy. Semiotext(e), 2009.
- **Benjamin, Ruha.** Race After Technology: Abolitionist Tools for the New Jim Code. Polity, 2019.
- **Fanon, Frantz.** Black Skin, White Masks. Grove Press, 1967.
- **Glissant, Édouard.** Poetics of Relation. University of Michigan Press, 1997.
- **Haraway, Donna.** Simians, Cyborgs, and Women: The Reinvention of Nature. Routledge, 1991.
- **Kafer, Alison.** Feminist, Queer, Crip. Indiana University Press, 2013.
- **McLuhan, Marshall.** Understanding Media: The Extensions of Man. McGraw-Hill, 1964.
- **Postman, Neil.** Technopoly: The Surrender of Culture to Technology. Vintage, 1993.
- **Whittaker, Meredith.** "The Steep Cost of Capture." Interactions 28, no. 6 (2021).
- **Hugging Face.** "The State of AI 2024: Open Source vs. Closed Models." (2024).
- **OpenAI.** "GPT-4o System Card: Safety and Limitations." (2024).
- **Anthropic.** "Claude 3.5 Sonnet: System Card and Safety Evaluation." (2025).

---

### Key Definitions

**Algorithmic Dysfluency** refers to the systemic and architectural failure of AI speech recognition models to accurately recognize, transcribe, and accommodate non-normative speech patterns such as stammering, stuttering, and aphasia, resulting in the exclusion of disabled users.

**Crip Technoscience** is an interdisciplinary field of study that merges Critical Disability Studies with Science and Technology Studies (STS) to challenge the ableist assumptions embedded in the design of technological systems and to propose crip-centric alternatives.

**Temporal Glitch** refers to a theoretical framework arguing that the "glitch" or the "stammer" is not a failure of the system, but a form of political resistance that disrupts the accelerated "time is money" logic of neoliberal capitalism.

**Semio-Capitalism** is a concept developed by Franco "Bifo" Berardi describing a mode of production where the creation of value is driven by the accumulation and exchange of signs, information, and attention rather than material goods.

**Integrity Retention** is a proposed metric for evaluating AI transcription systems that prioritizes the preservation of the speaker's original dysfluency, struggle, and emotional texture over the smoothing or sanitizing of speech into normative fluency.


---

<!-- METADATA_START -->
## Metadata & Citations

### Further Reading
- [Glitch Poetics: The Political Ontology of the Refused Body](https://www.ranti.dev/blog/glitch-poetics-disability.md)
- [Reproducible Research Notebooks for Digital Humanities](https://www.ranti.dev/blog/reproducible-research-notebooks-digital-humanities.md)
- [Digital Humanities Methods: A Comparison Guide](https://www.ranti.dev/blog/digital-humanities-methods-comparison-guide.md)

```json
{
  "@context": "https://schema.org",
  "@type": "TechArticle",
  "headline": "Algorithmic Dysfluency: Why AI Cannot Hear the Stammering Subject",
  "author": {
    "@type": "Person",
    "name": "Rantideb Howlader"
  },
  "datePublished": "2026-02-05T00:00:00.000Z",
  "url": "https://www.ranti.dev/blog/algorithmic-dysfluency",
  "license": "https://creativecommons.org/licenses/by/4.0/",
  "isAccessibleForFree": true
}
```

### BibTeX
```bibtex
@article{algorithmic-dysfluency_2026,
  author = {Rantideb Howlader},
  title = {Algorithmic Dysfluency: Why AI Cannot Hear the Stammering Subject},
  journal = {Rantideb Howlader Portfolio},
  year = {2026},
  url = {https://www.ranti.dev/blog/algorithmic-dysfluency},
  note = {Accessed: 2026-05-31}
}
```

### IEEE
R. Howlader, "Algorithmic Dysfluency: Why AI Cannot Hear the Stammering Subject," Rantideb Howlader Portfolio, 2026. [Online]. Available: https://www.ranti.dev/blog/algorithmic-dysfluency. [Accessed: 2026-05-31].

### APA
Howlader, R. (2026). Algorithmic Dysfluency: Why AI Cannot Hear the Stammering Subject. Rantideb Howlader Portfolio. Retrieved from https://www.ranti.dev/blog/algorithmic-dysfluency

--- 
*This content is provided in research-grade Markdown format. Required Attribution: Cite as Rantideb Howlader (2026).*
<!-- METADATA_END -->