fix(udt): declare audio stream before avformat_write_header()#392
fix(udt): declare audio stream before avformat_write_header()#392matteius merged 1 commit intoopensensor:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR fixes detection-triggered (UDT) MP4 recordings missing an audio track by ensuring the audio stream is declared before avformat_write_header() seals the MP4 header.
Changes:
- Add
pending_audio_codecpartomp4_writer_tto stage audio codec parameters for header-time stream registration. - Eagerly prepare AAC codec parameters in
udt_start_recording()so UDT can declare the audio stream before writing the MP4 header. - Consume
pending_audio_codecparinmp4_writer_initialize()to create the audio stream prior toavio_open/avformat_write_header(), and free it afterward.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/video/unified_detection_thread.c |
Precomputes audio codec parameters during UDT recording startup for use during MP4 initialization. |
src/video/mp4_writer_utils.c |
Declares an audio stream from staged codec parameters before writing the MP4 header. |
src/video/mp4_writer_core.c |
Initializes and frees the new staged codec-parameters field in writer lifecycle. |
include/video/mp4_writer.h |
Extends mp4_writer_t with pending_audio_codecpar for UDT header-time audio registration. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
e080ea5 to
71de980
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| a_stream->codecpar->codec_tag = 0; | ||
| a_stream->time_base = (AVRational){1, writer->pending_audio_codecpar->sample_rate}; | ||
| writer->audio.stream_idx = a_stream->index; |
There was a problem hiding this comment.
a_stream->time_base is set using writer->pending_audio_codecpar->sample_rate without validating it. If sample_rate is 0/unknown (possible for some pass-through audio codecpar), this creates an invalid {1,0} time base and can make avformat_write_header() fail or produce broken timestamps. Please guard for sample_rate > 0 and fall back to a safe default (or store/carry the input audio time_base from UDT and use that).
| // Pending AAC codec parameters set by udt_start_recording() before the first | ||
| // packet arrives. mp4_writer_initialize() consumes this to declare the audio | ||
| // stream BEFORE avformat_write_header() is called — the only legal window for | ||
| // adding streams to an MP4 container. NULL after consumption or on error. | ||
| AVCodecParameters *pending_audio_codecpar; |
There was a problem hiding this comment.
The field comment says "Pending AAC codec parameters", but udt_start_recording() can also populate pending_audio_codecpar with a deep-copy of already MP4-compatible non-AAC params (e.g., native AAC/MP3/etc.). Updating this comment to describe it as "pending audio codec parameters" (not necessarily AAC) will avoid misleading future changes.
| // Eagerly determine the output codec parameters so mp4_writer_initialize() | ||
| // can declare the audio stream BEFORE avformat_write_header() — the only | ||
| // legal window for adding streams to an MP4 container. | ||
| // | ||
| // The pending params must exactly match what mp4_writer_write_packet() will | ||
| // later mux: | ||
| // * PCM variants -> transcode to AAC (mp4_writer_write_packet transcodes) | ||
| // * MP4-compatible -> deep-copy as-is (packets pass through unchanged) | ||
| // * anything else -> disable audio (not supported) |
There was a problem hiding this comment.
This change fixes a previously silent audio-drop condition, but there’s no automated regression test covering the UDT/detection MP4 path to ensure an audio stream is declared before avformat_write_header() and the output contains both video+audio streams. Since CI runs FFmpeg-linked C tests, consider adding a small regression test that exercises this UDT init path (or mp4_writer_initialize with pending_audio_codecpar) and asserts 2 streams (e.g., via writer->output_ctx->nb_streams or ffprobe).
There was a problem hiding this comment.
done, but not tested
aa8cc7d to
4fc3d71
Compare
Detection-triggered MP4 recordings had no audio track because the audio stream was registered after avformat_write_header() — illegal for MP4 containers. Fixes opensensor#352. transcode_pcm_to_aac() is now called eagerly in udt_start_recording() and the resulting AVCodecParameters stored as pending_audio_codecpar. mp4_writer_initialize() consumes this before writing the header. Only the UDT path is affected; record_segment is unchanged.
4fc3d71 to
23d51d9
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } | ||
| } else { | ||
| // Incompatible and not PCM: cannot mux or transcode. | ||
| is_audio_codec_compatible_with_mp4(ain->codecpar->codec_id, &codec_name); |
There was a problem hiding this comment.
is_audio_codec_compatible_with_mp4() is called again here only to populate codec_name, but it was already evaluated in the preceding else if condition (and sets codec_name as a side effect even when returning false). Consider removing this redundant call and just log using the existing codec_name value.
| is_audio_codec_compatible_with_mp4(ain->codecpar->codec_id, &codec_name); |
| writer->is_rotating = 0; // Initialize rotation flag | ||
| writer->shutdown_component_id = -1; // Initialize to -1 to indicate not registered | ||
| writer->pending_audio_codecpar = NULL; // Populated by udt_start_recording(), consumed by mp4_writer_initialize() | ||
| writer->pending_audio_time_base = (AVRational){0, 1}; // Set alongside pending_audio_codecpar |
There was a problem hiding this comment.
pending_audio_time_base is initialized to {0, 1}, which looks “valid” to checks like den > 0 but represents an invalid 0 time base. Consider initializing to {0, 0} (FFmpeg’s common "unknown" rational) or another clearly-invalid sentinel, and ensure callers validate both num/den before using it.
| writer->pending_audio_time_base = (AVRational){0, 1}; // Set alongside pending_audio_codecpar | |
| writer->pending_audio_time_base = (AVRational){0, 0}; // Unknown/unset until provided alongside pending_audio_codecpar |
Fixes #352 — Detection-triggered MP4 recordings contain no audio track despite audio being enabled and the transcoder initialising successfully.
Problem
When ONVIF motion detection triggers a recording via the Unified Detection Thread (UDT), the resulting MP4 file contains only a video track. The log shows the audio transcoder initialising correctly, which makes the root cause non-obvious:
ffprobeon the output file confirms the issue:Root cause
The MP4 container format requires all streams to be declared before
avformat_write_header()is called. Once the header is written the stream table is sealed — any stream added afterwards is ignored by the muxer.The UDT recording path violated this constraint through a lazy-init design:
Every subsequent audio packet — from the pre-buffer flush, live recording, and post-buffer — is transcoded correctly but then discarded because
writer->audio.stream_idxis still-1.This affects only the UDT path (
mp4_writer_initialize). The scheduled recording path (record_segment) already handles audio registration correctly before writing the header and is unaffected.Solution
Declare the audio stream eagerly — before
avformat_write_header()— by splitting the work into two steps:Step 1 —
udt_start_recording(): call the statelesstranscode_pcm_to_aac()immediately after the writer is created to obtain the AACAVCodecParameters(including the MPEG-4 Audio Specific Config /extradata). Store the result in a newpending_audio_codecparfield on the writer.Step 2 —
mp4_writer_initialize(): in the video-keyframe branch, check forpending_audio_codecparand register the audio stream viaavformat_new_stream()beforeavio_open/avformat_write_header()are called. Consume and free the pending params immediately after.transcode_pcm_to_aac()is fully stateless (local alloc/free, no globalaudio_transcoders[]access), so calling it here does not interfere with the lazyinit_audio_transcoder()that runs during packet processing.Changes
include/video/mp4_writer.hAVCodecParameters *pending_audio_codecpartostruct mp4_writersrc/video/mp4_writer_core.cNULLinmp4_writer_create(); free inmp4_writer_close()src/video/unified_detection_thread.ctranscode_pcm_to_aac()eagerly inudt_start_recording()src/video/mp4_writer_utils.cpending_audio_codecparbeforeavformat_write_header()inmp4_writer_initialize()No changes to
record_segment,mp4_writer_thread,mp4_recording_core, or any other recording path.Testing
Hardware: Tapo C545D Dual-Lens, go2rtc RTSP proxy, Docker.
Before:
1 streams: ['video']After:
2 streams: ['video', 'audio']New log entries confirm correct sequencing: