fix(amd): make detection thresholds configurable and fix short-greeting voicemail misclassification by octo-patch · Pull Request #5490 · livekit/agents

octo-patch · 2026-04-20T02:22:53Z

Problem

The AMD classifier's short-greeting fast path (on_user_speech_ended) emitted a HUMAN verdict unconditionally whenever speech duration was <= HUMAN_SPEECH_THRESHOLD (2.5 s) followed by >= HUMAN_SILENCE_THRESHOLD (0.5 s) of silence -- even when transcript text had already been delivered via push_text() (meaning an LLM classification was already in flight).

This caused voicemail greetings that paused mid-sentence (e.g. ~2.33 s speech / 528 ms silence) to be misclassified as HUMAN regardless of what the transcript said.
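To make the failure concrete, here is a minimal sketch of the pre-fix decision as the description states it. The function name `old_fast_path_verdict` and its parameters are hypothetical, not taken from the codebase; only the two threshold values and the 2.33 s / 528 ms timings come from the description.

```python
from typing import Optional

HUMAN_SPEECH_THRESHOLD = 2.5   # seconds, value quoted in the description
HUMAN_SILENCE_THRESHOLD = 0.5  # seconds, value quoted in the description


def old_fast_path_verdict(
    speech_duration: float,
    silence_duration: float,
    transcript_pending: bool,
) -> Optional[str]:
    """Simplified model of the pre-fix fast path (hypothetical helper).

    transcript_pending is accepted but never consulted, which mirrors
    the bug: transcript evidence had no influence on the verdict.
    """
    if (
        speech_duration <= HUMAN_SPEECH_THRESHOLD
        and silence_duration >= HUMAN_SILENCE_THRESHOLD
    ):
        return "HUMAN"
    return None  # no fast-path verdict; other logic would decide


# The voicemail greeting from the description: ~2.33 s speech, 528 ms silence,
# with transcript text already delivered.
verdict = old_fast_path_verdict(2.33, 0.528, transcript_pending=True)
print(verdict)
```

Both conditions hold for the example timings, so the sketch returns a HUMAN verdict even though a transcript is pending.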

Solution

Two changes:

1. Make detection thresholds configurable

HUMAN_SPEECH_THRESHOLD, HUMAN_SILENCE_THRESHOLD, and MACHINE_SILENCE_THRESHOLD are now keyword arguments on both _AMDClassifier and the public AMD class, so callers can tune detection behaviour without patching module-level constants.
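The pattern might look roughly like the sketch below. `SketchClassifier` and `SketchAMD` are hypothetical stand-ins, not the real `_AMDClassifier` and `AMD` signatures, and the lowercase keyword names are assumed from the `machine_silence_threshold` spelling used later in this description. The 1.5 s default for `MACHINE_SILENCE_THRESHOLD` is a placeholder, since the description does not state the real value.

```python
# Module-level defaults. 2.5 s and 0.5 s are quoted in the description;
# 1.5 s is a placeholder because the real default is not stated there.
HUMAN_SPEECH_THRESHOLD = 2.5
HUMAN_SILENCE_THRESHOLD = 0.5
MACHINE_SILENCE_THRESHOLD = 1.5


class SketchClassifier:
    """Hypothetical stand-in for the internal classifier."""

    def __init__(
        self,
        *,
        human_speech_threshold: float = HUMAN_SPEECH_THRESHOLD,
        human_silence_threshold: float = HUMAN_SILENCE_THRESHOLD,
        machine_silence_threshold: float = MACHINE_SILENCE_THRESHOLD,
    ) -> None:
        # Store per-instance values instead of reading module constants later.
        self.human_speech_threshold = human_speech_threshold
        self.human_silence_threshold = human_silence_threshold
        self.machine_silence_threshold = machine_silence_threshold


class SketchAMD:
    """Hypothetical stand-in for the public class; forwards the thresholds."""

    def __init__(
        self,
        *,
        human_speech_threshold: float = HUMAN_SPEECH_THRESHOLD,
        human_silence_threshold: float = HUMAN_SILENCE_THRESHOLD,
        machine_silence_threshold: float = MACHINE_SILENCE_THRESHOLD,
    ) -> None:
        self._classifier = SketchClassifier(
            human_speech_threshold=human_speech_threshold,
            human_silence_threshold=human_silence_threshold,
            machine_silence_threshold=machine_silence_threshold,
        )


# A caller tunes one threshold and keeps the other defaults.
amd = SketchAMD(human_silence_threshold=0.8)
```

Keyword-only arguments keep call sites readable and leave the module constants as defaults, so existing callers that pass nothing would see unchanged behaviour.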

2. Fix the short-greeting fast path

When _classify_task is already running at the time on_user_speech_ended is called (indicating that transcript text has arrived), the fast-path HUMAN verdict is skipped. Instead the code falls through to the LLM path using machine_silence_threshold, so the LLM can classify the greeting from the available transcript.
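The branching described above can be sketched as a pure function. `choose_path` and its return labels are hypothetical; the real handler is a method with side effects, and the 1.5 s machine silence default is a placeholder. Routing long speech to the LLM path is also an assumption made to keep the sketch total.

```python
from typing import Tuple


def choose_path(
    speech_duration: float,
    classify_task_running: bool,
    *,
    human_speech_threshold: float = 2.5,
    human_silence_threshold: float = 0.5,
    machine_silence_threshold: float = 1.5,  # placeholder value
) -> Tuple[str, float]:
    """Return (path, silence_to_wait_seconds) for a finished speech segment.

    Hypothetical model of the post-fix decision: the fast path is taken
    only when no LLM classification is already in flight.
    """
    is_short = speech_duration <= human_speech_threshold
    if is_short and not classify_task_running:
        # No transcript yet: a short utterance followed by a brief
        # silence is treated as a human greeting.
        return ("fast_path_human", human_silence_threshold)
    # Transcript evidence exists (or the speech was long): let the LLM decide
    # after the longer machine silence window.
    return ("llm", machine_silence_threshold)


# The 2.33 s greeting from the description, with a transcript already pushed.
print(choose_path(2.33, classify_task_running=True))
# The same greeting with no transcript yet still takes the fast path.
print(choose_path(2.33, classify_task_running=False))
```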

Testing

The fix is consistent with the existing flow: push_text() already creates _classify_task, so detecting _classify_task is not None reliably indicates that transcript evidence is available. No new runtime dependencies are introduced.
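A small self-contained model of that invariant is shown below. `TinyClassifier` is a hypothetical toy, not the real classifier: it only demonstrates that a task created inside `push_text()` makes `_classify_task is not None` a usable signal, and its keyword match is a stand-in for the LLM call.

```python
import asyncio
from typing import List, Optional


class TinyClassifier:
    """Toy model: push_text() lazily starts a classification task."""

    def __init__(self) -> None:
        self._classify_task: Optional[asyncio.Task] = None
        self._text: List[str] = []

    def push_text(self, text: str) -> None:
        self._text.append(text)
        if self._classify_task is None:
            # First transcript chunk starts the (stand-in) LLM classification.
            self._classify_task = asyncio.create_task(self._classify())

    async def _classify(self) -> str:
        await asyncio.sleep(0)  # yield once, standing in for LLM latency
        transcript = " ".join(self._text).lower()
        # Keyword match is a placeholder for the real LLM verdict.
        return "MACHINE" if "leave a message" in transcript else "HUMAN"

    def has_transcript_evidence(self) -> bool:
        return self._classify_task is not None


async def demo():
    classifier = TinyClassifier()
    before = classifier.has_transcript_evidence()
    classifier.push_text("Hi, please leave a message")
    after = classifier.has_transcript_evidence()
    verdict = await classifier._classify_task
    return before, after, verdict


result = asyncio.run(demo())
print(result)
```

A regression test for the real classifier could follow the same shape: push transcript text, end a short speech segment, and assert that no HUMAN verdict is emitted before the LLM result arrives.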

…il misclassification

Fixes livekit#5477

Two related changes in the AMD classifier:

1. Expose HUMAN_SPEECH_THRESHOLD, HUMAN_SILENCE_THRESHOLD, and MACHINE_SILENCE_THRESHOLD as keyword arguments on both _AMDClassifier and the public AMD class.

2. Fix the short-greeting fast path: when speech ends within human_speech_threshold, the classifier previously emitted HUMAN unconditionally after a brief silence, ignoring transcript text that had already arrived via push_text(). When _classify_task is already running, the code now falls through to the LLM path so short voicemail greetings (e.g. paused mid-sentence at 2.3 s / 528 ms silence) are no longer misclassified as HUMAN.

CLAassistant · 2026-04-20T02:23:00Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

octo-patch seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.

devin-ai-integration Bot reviewed Apr 20, 2026

chenghao-mou self-assigned this Apr 21, 2026

octo-patch wants to merge 1 commit into livekit:main from octo-patch:fix/issue-5477-amd-configurable-thresholds

3 participants
