How to correctly downmix multichannel audio to stereo
Rant:
I had some spare time and browsed the vgmstream C source, where I found the following:
void mixing_macro_downmix(VGMSTREAM* vgmstream, int max /*, mapping_t output_mapping*/) {
    mixer_t* mixer = vgmstream->mixer;
    int output_channels, mp_in, mp_out, ch_in, ch_out;
    channel_mapping_t input_mapping, output_mapping;
    const double vol_max = 1.0;
    const double vol_sqrt = 1 / sqrt(2);
    const double vol_half = 1 / 2;
    double matrix[16][16] = {{0}};

    if (!mixer)
        return;

    if (max <= 1 || mixer->output_channels <= max || max >= 8)
        return;

    /* assume WAV defaults if not set */
    input_mapping = vgmstream->channel_layout;
    if (input_mapping == 0) {
        switch(mixer->output_channels) {
            case 1: input_mapping = mapping_MONO; break;
            case 2: input_mapping = mapping_STEREO; break;
            case 3: input_mapping = mapping_2POINT1; break;
            case 4: input_mapping = mapping_QUAD; break;
            case 5: input_mapping = mapping_5POINT0; break;
            case 6: input_mapping = mapping_5POINT1; break;
            case 7: input_mapping = mapping_7POINT0; break;
            case 8: input_mapping = mapping_7POINT1; break;
            default: return;
        }
    }

    /* build mapping matrix[input channel][output channel] = volume,
     * using standard WAV/AC3 downmix formulas
     * - Documentation | Audiokinetic
     * - Documentation | Audiokinetic
     */
    switch(max) {
        case 1:
            output_mapping = mapping_MONO;
            matrix[pos_FL][pos_FC] = vol_sqrt;
            matrix[pos_FR][pos_FC] = vol_sqrt;
            matrix[pos_FC][pos_FC] = vol_max;
            matrix[pos_SL][pos_FC] = vol_half;
            matrix[pos_SR][pos_FC] = vol_half;
            matrix[pos_BL][pos_FC] = vol_half;
            matrix[pos_BR][pos_FC] = vol_half;
            break;
        case 2:
            output_mapping = mapping_STEREO;
            matrix[pos_FL][pos_FL] = vol_max;
            matrix[pos_FR][pos_FR] = vol_max;
            matrix[pos_FC][pos_FL] = vol_sqrt;
            matrix[pos_FC][pos_FR] = vol_sqrt;
            matrix[pos_SL][pos_FL] = vol_sqrt;
            matrix[pos_SR][pos_FR] = vol_sqrt;
            matrix[pos_BL][pos_FL] = vol_sqrt;
            matrix[pos_BR][pos_FR] = vol_sqrt;
            break;
        default:
            /* not sure if +3ch would use FC/LFE, SL/BR and whatnot without passing extra config, so ignore for now */
            return;
    }

    /* save and make N fake channels at the beginning for easier calcs */
    output_channels = mixer->output_channels;
    for (int ch = 0; ch < max; ch++) {
        mixing_push_upmix(vgmstream, 0);
    }

    /* downmix */
    ch_in = 0;
    for (mp_in = 0; mp_in < 16; mp_in++) {
        /* read input mapping (ex. 5.1) and find channel */
        if (!(input_mapping & (1 << mp_in)))
            continue;

        ch_out = 0;
        for (mp_out = 0; mp_out < 16; mp_out++) {
            /* read output mapping (ex. 2.0) and find channel */
            if (!(output_mapping & (1 << mp_out)))
                continue;

            mixing_push_add(vgmstream, ch_out, max + ch_in, matrix[mp_in][mp_out]);

            ch_out++;
            if (ch_out > max)
                break;
        }

        ch_in++;
        if (ch_in >= output_channels)
            break;
    }

    /* remove unneeded channels */
    mixing_push_killmix(vgmstream, max);
}
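First, three downmix volume constants are defined:
const double vol_max = 1.0; // 100% volume
const double vol_sqrt = 1 / sqrt(2); // 70.71% volume
const double vol_half = 1 / 2; // meant to be 50% volume
Note that as written, 1 / 2 is integer division in C and evaluates to 0, so vol_half is actually 0.0 rather than 0.5. It is only used in case 1 (mono), so the stereo table is not affected.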
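To make the integer-division point concrete, here is a minimal sketch in C. The two function names are hypothetical and exist only for this illustration; they are not part of vgmstream.

```c
/* Hypothetical helpers for illustration only; not part of vgmstream. */

/* Same expression as the quoted source: both operands are int,
 * so the division happens in integer arithmetic and yields 0,
 * which is then converted to 0.0. */
static double vol_half_as_quoted(void) {
    return 1 / 2;
}

/* One possible fix: make an operand floating-point so the
 * division happens in double precision and yields 0.5. */
static double vol_half_intended(void) {
    return 1.0 / 2;
}
```

vol_sqrt does not have this problem, because sqrt() returns a double and the 1 is converted before dividing.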
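From case 2, we get the following downmix table:
| Input channel | Left coefficient | Right coefficient |
|---|---|---|
| Front left (FL) | 1.0 | 0.0 |
| Front right (FR) | 0.0 | 1.0 |
| Front center (FC) | 0.7071 | 0.7071 |
| Side left (SL) | 0.7071 | 0.0 |
| Side right (SR) | 0.0 | 0.7071 |
| Back left (BL) | 0.7071 | 0.0 |
| Back right (BR) | 0.0 | 0.7071 |
| Low-frequency effects (LFE) | 0.0 | 0.0 |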
The same table expressed as an ffmpeg pan filter:
ffmpeg -i <input_file> -af "pan=stereo|FL=1.0*FL+0.7071*FC+0.7071*SL+0.7071*BL|FR=1.0*FR+0.7071*FC+0.7071*SR+0.7071*BR" <output_file>
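For readers who want to apply the case 2 coefficients by hand, here is a minimal sketch in C that mixes one 7.1 frame down to stereo. The channel order (FL, FR, FC, LFE, BL, BR, SL, SR, the usual WAV 7.1 order) and all names below are assumptions of this sketch, not taken from vgmstream.

```c
/* Assumed channel order for this sketch (WAV 7.1 order); not from vgmstream. */
enum { CH_FL, CH_FR, CH_FC, CH_LFE, CH_BL, CH_BR, CH_SL, CH_SR, CH_COUNT };

#define VOL_MAX  1.0
#define VOL_SQRT 0.70710678118654752440 /* 1 / sqrt(2) */

/* One row per input channel: { left coefficient, right coefficient }. */
static const double downmix_coef[CH_COUNT][2] = {
    [CH_FL]  = { VOL_MAX,  0.0      },
    [CH_FR]  = { 0.0,      VOL_MAX  },
    [CH_FC]  = { VOL_SQRT, VOL_SQRT },
    [CH_LFE] = { 0.0,      0.0      }, /* LFE is dropped */
    [CH_BL]  = { VOL_SQRT, 0.0      },
    [CH_BR]  = { 0.0,      VOL_SQRT },
    [CH_SL]  = { VOL_SQRT, 0.0      },
    [CH_SR]  = { 0.0,      VOL_SQRT },
};

/* Mix one frame of CH_COUNT input samples into one stereo frame.
 * No clipping protection is applied: the result can exceed full scale. */
static void downmix_frame(const double in[CH_COUNT], double out[2]) {
    out[0] = 0.0;
    out[1] = 0.0;
    for (int ch = 0; ch < CH_COUNT; ch++) {
        out[0] += in[ch] * downmix_coef[ch][0];
        out[1] += in[ch] * downmix_coef[ch][1];
    }
}
```

Calling downmix_frame once per sample frame of a decoded buffer gives the same per-sample arithmetic as the table. A real implementation would also need to handle the sample format and clipping.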
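As for why applying this same downmix table in Adobe Audition gives a different result: my guess is that Adobe Audition applies some limiting or compression along the way (its loudness matching has the same problem, doing all sorts of things behind your back).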
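One reason a tool might want to limit the result: the coefficients feeding each output channel add up to well over 1. The sketch below only computes that worst-case gain from the table's rounded values. It says nothing about what Adobe Audition actually does, and the function names are hypothetical.

```c
/* Left-channel coefficients from the case 2 table, in the order FL, FC, SL, BL. */
static const double left_coef[4] = { 1.0, 0.7071, 0.7071, 0.7071 };

/* Worst-case gain of one output channel: the sum of the absolute
 * coefficients, reached when all inputs peak at once with matching sign. */
static double worst_case_gain(const double *coef, int n) {
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        sum += coef[i] < 0.0 ? -coef[i] : coef[i];
    }
    return sum;
}

/* Scale factor that would keep the worst case at or below full scale. */
static double safe_scale(double gain) {
    return gain > 1.0 ? 1.0 / gain : 1.0;
}
```

For the left channel the sum is 1.0 + 3 × 0.7071 = 3.1213, so with unlucky input the unscaled mix can clip. In ffmpeg's pan filter, writing `FL<...` instead of `FL=...` renormalizes the gains for that channel so that they total 1, which avoids clipping at the cost of a quieter result.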