[x264 Video Encoder: Usage and Implementation] 8. The One-Frame Encoding API: x264_encoder_encode (1)


Abstract:

This is the eighth post in the series "x264 Video Encoder: Usage and Implementation". It discusses the implementation of the encoder instance's picture-encoding API.

1. Overall Structure

In example.c, the encoder is first opened with the x264_encoder_open API. The main encoding loop then repeatedly reads raw pixel data from the input YUV file, encodes it, and writes the result to the output bitstream file:

for( ;; i_frame++ )
{
    /* Read input frame */
    if( fread( pic.img.plane[0], 1, luma_size, stdin ) != luma_size )
        break;
    if( fread( pic.img.plane[1], 1, chroma_size, stdin ) != chroma_size )
        break;
    if( fread( pic.img.plane[2], 1, chroma_size, stdin ) != chroma_size )
        break;

    pic.i_pts = i_frame;
    i_frame_size = x264_encoder_encode( h, &nal, &i_nal, &pic, &pic_out );
    if( i_frame_size < 0 )
        goto fail;
    else if( i_frame_size )
    {
        if( !fwrite( nal->p_payload, i_frame_size, 1, stdout ) )
            goto fail;
    }
}

As this code shows, the luma plane of the input YUV data is stored in pic.img.plane[0], and the two chroma planes in pic.img.plane[1] and pic.img.plane[2]. pic is then passed to x264_encoder_encode for encoding.
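As a side note, the luma_size and chroma_size used by the fread calls follow from 4:2:0 subsampling. A minimal sketch (the helper names are hypothetical, not part of x264):

```c
#include <assert.h>

/* For I420 (4:2:0), each chroma plane is subsampled by 2 both horizontally
 * and vertically, so it holds a quarter as many samples as the luma plane. */
static long i420_luma_size( int width, int height )
{
    return (long)width * height;
}

static long i420_chroma_size( int width, int height )
{
    return (long)width * height / 4;
}
```

For 1920x1080 this gives 2073600 luma bytes and 518400 bytes per chroma plane, which is what luma_size and chroma_size would hold before the loop.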

The following figure shows the internal call graph of the first half of x264_encoder_encode.

2. The Implementation of x264_encoder_encode

x264_encoder_encode encodes one input picture (in raw pixel format) into a sequence of NAL units. It is implemented at lines 3224–3797 of encoder.c and declared as follows:

/* x264_encoder_encode:
 *      encode one picture.
 *      *pi_nal is the number of NAL units outputted in pp_nal.
 *      returns the number of bytes in the returned NALs.
 *      returns negative on error and zero if no NAL units returned.
 *      the payloads of all output NALs are guaranteed to be sequential in memory. */
int x264_encoder_encode( x264_t *, x264_nal_t **pp_nal, int *pi_nal, x264_picture_t *pic_in, x264_picture_t *pic_out );

Since this function carries the core per-frame encoding process, its implementation is fairly complex. This section walks through its functionality step by step.

2.1 Setting Up Thread-Related State

x264_encoder_encode begins with thread-related setup:

/****************************************************************************
 * x264_encoder_encode:
 *  XXX: i_poc   : is the poc of the current given picture
 *       i_frame : is the number of the frame being coded
 *  ex:  type frame poc
 *       I      0   2*0
 *       P      1   2*3
 *       B      2   2*1
 *       B      3   2*2
 *       P      4   2*6
 *       B      5   2*4
 *       B      6   2*5
 ****************************************************************************/
int x264_encoder_encode( x264_t *h,
                         x264_nal_t **pp_nal, int *pi_nal,
                         x264_picture_t *pic_in,
                         x264_picture_t *pic_out )
{
    x264_t *thread_current, *thread_prev, *thread_oldest;
    int i_nal_type, i_nal_ref_idc, i_global_qp;
    int overhead = NALU_OVERHEAD;

#if HAVE_OPENCL
    if( h->opencl.b_fatal_error )
        return -1;
#endif

    if( h->i_thread_frames > 1 )
    {
        thread_prev    = h->thread[ h->i_thread_phase ];
        h->i_thread_phase = (h->i_thread_phase + 1) % h->i_thread_frames;
        thread_current = h->thread[ h->i_thread_phase ];
        thread_oldest  = h->thread[ (h->i_thread_phase + 1) % h->i_thread_frames ];
        thread_sync_context( thread_current, thread_prev );
        x264_thread_sync_ratecontrol( thread_current, thread_prev, thread_oldest );
        h = thread_current;
    }
    else
    {
        thread_current =
        thread_oldest  = h;
    }

    // ......
}

Recap: setting the number of encoding threads

The key parameter i_thread_frames is set in validate_parameters():

h->i_thread_frames = h->param.b_sliced_threads ? 1 : h->param.i_threads;

In other words, i_thread_frames is governed by b_sliced_threads: if b_sliced_threads is 1, i_thread_frames is 1; if it is 0, i_thread_frames equals i_threads (the actual thread count).

i_threads is initialized to X264_THREADS_AUTO (0) in x264_param_default and then adjusted in validate_parameters as follows:

if( h->param.i_threads == X264_THREADS_AUTO )
{
    h->param.i_threads = x264_cpu_num_processors() * (h->param.b_sliced_threads?2:3)/2;
    /* Avoid too many threads as they don't improve performance and
     * complicate VBV. Capped at an arbitrary 2 rows per thread. */
    int max_threads = X264_MAX( 1, (h->param.i_height+15)/16 / 2 );
    h->param.i_threads = X264_MIN( h->param.i_threads, max_threads );
}
int max_sliced_threads = X264_MAX( 1, (h->param.i_height+15)/16 / 4 );
if( h->param.i_threads > 1 )
{
#if !HAVE_THREAD
    x264_log( h, X264_LOG_WARNING, "not compiled with thread support!\n");
    h->param.i_threads = 1;
#endif
    /* Avoid absurdly small thread slices as they can reduce performance
     * and VBV compliance. Capped at an arbitrary 4 rows per thread. */
    if( h->param.b_sliced_threads )
        h->param.i_threads = X264_MIN( h->param.i_threads, max_sliced_threads );
}
h->param.i_threads = x264_clip3( h->param.i_threads, 1, X264_THREAD_MAX );
if( h->param.i_threads == 1 )
{
    h->param.b_sliced_threads = 0;
    h->param.i_lookahead_threads = 1;
}

The adjustments to i_threads are:

  • initial value: 1x the CPU core count when sliced threading (b_sliced_threads) is enabled, or 1.5x when it is disabled;
  • capped at one thread per two macroblock rows (and one per four rows for sliced threads), since more threads stop helping performance and complicate VBV;
  • hard upper bound: X264_THREAD_MAX (128);
  • if only one thread remains (i_threads == 1), sliced threading is disabled and the number of lookahead threads is set to 1.
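The adjustments above can be condensed into a small standalone sketch (the function and macro names are local to this example, not x264's):

```c
#include <assert.h>

#define XMAX(a,b) ((a) > (b) ? (a) : (b))
#define XMIN(a,b) ((a) < (b) ? (a) : (b))
#define THREAD_MAX 128   /* stands in for X264_THREAD_MAX */

/* Mirrors the i_threads heuristic from validate_parameters. */
static int auto_threads( int cpus, int height, int b_sliced )
{
    int mb_rows = (height + 15) / 16;
    int threads = cpus * (b_sliced ? 2 : 3) / 2;            /* 1x or 1.5x the core count */
    threads = XMIN( threads, XMAX( 1, mb_rows / 2 ) );      /* at most 1 thread per 2 MB rows */
    if( b_sliced && threads > 1 )
        threads = XMIN( threads, XMAX( 1, mb_rows / 4 ) );  /* 4 rows per sliced thread */
    return XMIN( XMAX( threads, 1 ), THREAD_MAX );
}
```

For an 8-core machine at 1080p this yields 12 frame threads (8 * 1.5), comfortably under the 34-thread row cap.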

Thread context synchronization

The thread handles are allocated in x264_encoder_open:

h->thread[0] = h;
for( int i = 1; i < h->param.i_threads + !!h->param.i_sync_lookahead; i++ )
    CHECKED_MALLOC( h->thread[i], sizeof(x264_t) );

In x264_encoder_open the x264_t structure h is zero-initialized, including i_thread_phase. The following code in x264_encoder_encode then synchronizes the thread contexts:

if( h->i_thread_frames > 1 )
{
    // thread_prev, thread_current and thread_oldest are the previous, current,
    // and (in ring order) oldest thread handles respectively.
    thread_prev    = h->thread[ h->i_thread_phase ];
    h->i_thread_phase = (h->i_thread_phase + 1) % h->i_thread_frames;
    thread_current = h->thread[ h->i_thread_phase ];
    thread_oldest  = h->thread[ (h->i_thread_phase + 1) % h->i_thread_frames ];
    thread_sync_context( thread_current, thread_prev );
    x264_thread_sync_ratecontrol( thread_current, thread_prev, thread_oldest );
    h = thread_current;
}
else
{
    thread_current =
    thread_oldest  = h;
}

thread_sync_context is implemented as follows:

static void thread_sync_context( x264_t *dst, x264_t *src )
{
    if( dst == src )
        return;

    // reference counting
    for( x264_frame_t **f = src->frames.reference; *f; f++ ) // increment the refcount of each of src's reference frames
        (*f)->i_reference_count++;
    for( x264_frame_t **f = dst->frames.reference; *f; f++ ) // decrement dst's; a frame reaching 0 goes to src's unused pool
        x264_frame_push_unused( src, *f );
    src->fdec->i_reference_count++;
    x264_frame_push_unused( src, dst->fdec );

    // copy everything except the per-thread pointers and the constants.
    memcpy( &dst->i_frame, &src->i_frame, offsetof(x264_t, mb.base) - offsetof(x264_t, i_frame) );
    dst->param    = src->param;
    dst->stat     = src->stat;
    dst->pixf     = src->pixf;
    dst->reconfig = src->reconfig;
}

thread_sync_context copies the context of the previous handle into the current one.

x264_thread_sync_ratecontrol does the same for the rate-control state.
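The rotation of h->i_thread_phase can be pictured as a ring over the thread handles: each call advances the phase, the new phase indexes the current handle, and the one after it is the oldest (the next to be reused). A toy sketch with illustrative names:

```c
#include <assert.h>

typedef struct
{
    int phase;  /* stands in for h->i_thread_phase  */
    int n;      /* stands in for h->i_thread_frames */
} ring_t;

/* Advance the ring by one frame; report the prev/current/oldest slot indices. */
static void ring_advance( ring_t *r, int *prev, int *cur, int *oldest )
{
    *prev    = r->phase;
    r->phase = (r->phase + 1) % r->n;
    *cur     = r->phase;
    *oldest  = (r->phase + 1) % r->n;
}
```

With 3 frame threads and phase starting at 0 (it is zeroed in x264_encoder_open), the first call gives prev=0, cur=1, oldest=2; the second gives prev=1, cur=2, oldest=0, and so on around the ring.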

2.2 Saving the Input Picture

x264_encoder_encode must save the externally supplied picture into the encoder's internal structures:

int x264_encoder_encode( x264_t *h,
                         x264_nal_t **pp_nal, int *pi_nal,
                         x264_picture_t *pic_in,
                         x264_picture_t *pic_out )
{
    // ......
    /* ------------------- Setup new frame from picture -------------------- */
    if( pic_in != NULL )
    {
        if( h->lookahead->b_exit_thread )
        {
            x264_log( h, X264_LOG_ERROR, "lookahead thread is already stopped\n" );
            return -1;
        }

        /* 1: Copy the picture to a frame and move it to a buffer */
        x264_frame_t *fenc = x264_frame_pop_unused( h, 0 );
        if( !fenc )
            return -1;

        if( x264_frame_copy_picture( h, fenc, pic_in ) < 0 )
            return -1;

        if( h->param.i_width != 16 * h->mb.i_mb_width ||
            h->param.i_height != 16 * h->mb.i_mb_height )
            x264_frame_expand_border_mod16( h, fenc );

        // set the current frame's frame number
        fenc->i_frame = h->frames.i_input++;

        // set the current frame's timestamps
        if( fenc->i_frame == 0 )
            h->frames.i_first_pts = fenc->i_pts;
        if( h->frames.i_bframe_delay && fenc->i_frame == h->frames.i_bframe_delay )
            h->frames.i_bframe_delay_time = fenc->i_pts - h->frames.i_first_pts;

        if( h->param.b_vfr_input && fenc->i_pts <= h->frames.i_largest_pts )
            x264_log( h, X264_LOG_WARNING, "non-strictly-monotonic PTS\n" );

        h->frames.i_second_largest_pts = h->frames.i_largest_pts;
        h->frames.i_largest_pts = fenc->i_pts;

        // clamp i_pic_struct to the legal range
        if( (fenc->i_pic_struct < PIC_STRUCT_AUTO) || (fenc->i_pic_struct > PIC_STRUCT_TRIPLE) )
            fenc->i_pic_struct = PIC_STRUCT_AUTO;

        if( fenc->i_pic_struct == PIC_STRUCT_AUTO )
        {
#if HAVE_INTERLACED
            int b_interlaced = fenc->param ? fenc->param->b_interlaced : h->param.b_interlaced;
#else
            int b_interlaced = 0;
#endif
            if( b_interlaced )
            {
                int b_tff = fenc->param ? fenc->param->b_tff : h->param.b_tff;
                fenc->i_pic_struct = b_tff ? PIC_STRUCT_TOP_BOTTOM : PIC_STRUCT_BOTTOM_TOP;
            }
            else
                fenc->i_pic_struct = PIC_STRUCT_PROGRESSIVE; // progressive-scan picture
        }

        if( h->param.rc.b_mb_tree && h->param.rc.b_stat_read )
        {
            if( x264_macroblock_tree_read( h, fenc, pic_in->prop.quant_offsets ) )
                return -1;
        }
        else
            // set qp_offset for every macroblock of the frame, handling the adaptive-quantization settings
            x264_adaptive_quant_frame( h, fenc, pic_in->prop.quant_offsets );

        // ......
    }
    // ......
}

As the implementation above shows, saving the external picture takes the following steps:

  1. x264_frame_pop_unused: take an empty frame structure from the internal buffer pool;
  2. x264_frame_copy_picture: copy the external picture data into the internal frame buffer;
  3. x264_frame_expand_border_mod16: pad the picture so that its width and height are multiples of 16.

x264_frame_pop_unused

x264_frame_pop_unused is implemented as follows:

x264_frame_t *x264_frame_pop_unused( x264_t *h, int b_fdec )
{
    x264_frame_t *frame;
    if( h->frames.unused[b_fdec][0] ) // the caller above passes b_fdec == 0
        frame = x264_frame_pop( h->frames.unused[b_fdec] );
    else
        frame = frame_new( h, b_fdec );
    if( !frame )
        return NULL;
    frame->b_last_minigop_bframe = 0;
    frame->i_reference_count = 1;
    frame->b_intra_calculated = 0;
    frame->b_scenecut = 1;
    frame->b_keyframe = 0;
    frame->b_corrupt = 0;
    frame->i_slice_count = h->param.b_sliced_threads ? h->param.i_threads : 1;

    memset( frame->weight, 0, sizeof(frame->weight) );
    memset( frame->f_weighted_cost_delta, 0, sizeof(frame->f_weighted_cost_delta) );

    return frame;
}

This function's job is to obtain an x264_frame_t and return it to the caller. Depending on whether the pool of unused frames is empty, one of two methods is used:

  • x264_frame_pop: take a frame from the unused-frame pool;
  • frame_new: allocate a new frame structure.
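The pool itself is just a NULL-terminated array of pointers, and x264_frame_pop gives it stack semantics by removing the last non-NULL entry. A simplified sketch with a dummy frame type (illustrative names, not x264's code):

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int id; } toy_frame_t;

/* Append at the first NULL slot (the caller guarantees spare capacity). */
static void toy_push( toy_frame_t **list, toy_frame_t *f )
{
    int i = 0;
    while( list[i] ) i++;
    list[i] = f;
}

/* Remove and return the last non-NULL entry, as x264_frame_pop does. */
static toy_frame_t *toy_pop( toy_frame_t **list )
{
    int i = 0;
    assert( list[0] );
    while( list[i+1] ) i++;
    toy_frame_t *f = list[i];
    list[i] = NULL;
    return f;
}
```

Pushing frames A and B, then popping, returns B first: last in, first out.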

x264_frame_pop is implemented as follows:

x264_frame_t *x264_frame_pop( x264_frame_t **list )
{
    // here list == h->frames.unused[0]
    x264_frame_t *frame;
    int i = 0;
    assert( list[0] );

    // find the last non-NULL entry of h->frames.unused[0][n] and return it to the caller
    while( list[i+1] ) i++;
    frame = list[i];
    list[i] = NULL;
    return frame;
}

The implementation of frame_new is rather long:

static x264_frame_t *frame_new( x264_t *h, int b_fdec )
{
x264_frame_t *frame;
int i_csp = frame_internal_csp( h->param.i_csp );
int i_mb_count = h->mb.i_mb_count;
int i_stride, i_width, i_lines, luma_plane_count;
int i_padv = PADV << PARAM_INTERLACED;
int align = 16;
#if ARCH_X86 || ARCH_X86_64
if( h->param.cpu&X264_CPU_CACHELINE_64 || h->param.cpu&X264_CPU_AVX512 )
align = 64;
else if( h->param.cpu&X264_CPU_CACHELINE_32 || h->param.cpu&X264_CPU_AVX )
align = 32;
#endif
#if ARCH_PPC
int disalign = 1<<9;
#else
int disalign = 1<<10;
#endif

/* ensure frame alignment after PADH is added */
int padh_align = X264_MAX( align - PADH * sizeof(pixel), 0 ) / sizeof(pixel);

CHECKED_MALLOCZERO( frame, sizeof(x264_frame_t) );
PREALLOC_INIT

/* allocate frame data (+64 for extra data for me) */
i_width = h->mb.i_mb_width*16;
i_lines = h->mb.i_mb_height*16;
// align the stride (align) while avoiding pathological power-of-two strides (disalign)
i_stride = align_stride( i_width + 2*PADH, align, disalign );

// set per-plane parameters for the different colorspace formats
if( i_csp == X264_CSP_NV12 || i_csp == X264_CSP_NV16 )
{
luma_plane_count = 1;
frame->i_plane = 2;
for( int i = 0; i < 2; i++ )
{
frame->i_width[i] = i_width >> i;
frame->i_lines[i] = i_lines >> (i && i_csp == X264_CSP_NV12);
frame->i_stride[i] = i_stride;
}
}
else if( i_csp == X264_CSP_I444 )
{
luma_plane_count = 3;
frame->i_plane = 3;
for( int i = 0; i < 3; i++ )
{
frame->i_width[i] = i_width;
frame->i_lines[i] = i_lines;
frame->i_stride[i] = i_stride;
}
}
else if( i_csp == X264_CSP_I400 )
{
luma_plane_count = 1;
frame->i_plane = 1;
frame->i_width[0] = i_width;
frame->i_lines[0] = i_lines;
frame->i_stride[0] = i_stride;
}
else
goto fail;

frame->i_csp = i_csp;
frame->i_width_lowres = frame->i_width[0]/2;
frame->i_lines_lowres = frame->i_lines[0]/2;
frame->i_stride_lowres = align_stride( frame->i_width_lowres + 2*PADH, align, disalign<<1 );

for( int i = 0; i < h->param.i_bframe + 2; i++ )
for( int j = 0; j < h->param.i_bframe + 2; j++ )
PREALLOC( frame->i_row_satds[i][j], i_lines/16 * sizeof(int) );

// initialize the remaining frame fields
frame->i_poc = -1;
frame->i_type = X264_TYPE_AUTO;
frame->i_qpplus1 = X264_QP_AUTO;
frame->i_pts = -1;
frame->i_frame = -1;
frame->i_frame_num = -1;
frame->i_lines_completed = -1;
frame->b_fdec = b_fdec;
frame->i_pic_struct = PIC_STRUCT_AUTO;
frame->i_field_cnt = -1;
frame->i_duration =
frame->i_cpb_duration =
frame->i_dpb_output_delay =
frame->i_cpb_delay = 0;
frame->i_coded_fields_lookahead =
frame->i_cpb_delay_lookahead = -1;

frame->orig = frame;

// allocate the chroma plane buffer
if( i_csp == X264_CSP_NV12 || i_csp == X264_CSP_NV16 )
{
int chroma_padv = i_padv >> (i_csp == X264_CSP_NV12);
int chroma_plane_size = (frame->i_stride[1] * (frame->i_lines[1] + 2*chroma_padv));
PREALLOC( frame->buffer[1], (chroma_plane_size + padh_align) * sizeof(pixel) );
if( PARAM_INTERLACED )
PREALLOC( frame->buffer_fld[1], (chroma_plane_size + padh_align) * sizeof(pixel) );
}

/* all 4 luma planes allocated together, since the cacheline split code
* requires them to be in-phase wrt cacheline alignment. */
for( int p = 0; p < luma_plane_count; p++ )
{
int luma_plane_size = align_plane_size( frame->i_stride[p] * (frame->i_lines[p] + 2*i_padv), disalign );
if( h->param.analyse.i_subpel_refine && b_fdec )
luma_plane_size *= 4;

/* FIXME: Don't allocate both buffers in non-adaptive MBAFF. */
PREALLOC( frame->buffer[p], (luma_plane_size + padh_align) * sizeof(pixel) );
if( PARAM_INTERLACED )
PREALLOC( frame->buffer_fld[p], (luma_plane_size + padh_align) * sizeof(pixel) );
}

frame->b_duplicate = 0;

// the frame being created holds a reconstructed picture used during encoding
if( b_fdec ) /* fdec frame */
{
PREALLOC( frame->mb_type, i_mb_count * sizeof(int8_t) );
PREALLOC( frame->mb_partition, i_mb_count * sizeof(uint8_t) );
PREALLOC( frame->mv[0], 2*16 * i_mb_count * sizeof(int16_t) );
PREALLOC( frame->mv16x16, 2*(i_mb_count+1) * sizeof(int16_t) );
PREALLOC( frame->ref[0], 4 * i_mb_count * sizeof(int8_t) );
if( h->param.i_bframe )
{
PREALLOC( frame->mv[1], 2*16 * i_mb_count * sizeof(int16_t) );
PREALLOC( frame->ref[1], 4 * i_mb_count * sizeof(int8_t) );
}
else
{
frame->mv[1] = NULL;
frame->ref[1] = NULL;
}
PREALLOC( frame->i_row_bits, i_lines/16 * sizeof(int) );
PREALLOC( frame->f_row_qp, i_lines/16 * sizeof(float) );
PREALLOC( frame->f_row_qscale, i_lines/16 * sizeof(float) );
if( h->param.analyse.i_me_method >= X264_ME_ESA )
PREALLOC( frame->buffer[3], frame->i_stride[0] * (frame->i_lines[0] + 2*i_padv) * sizeof(uint16_t) << h->frames.b_have_sub8x8_esa );
if( PARAM_INTERLACED )
PREALLOC( frame->field, i_mb_count * sizeof(uint8_t) );
if( h->param.analyse.b_mb_info )
PREALLOC( frame->effective_qp, i_mb_count * sizeof(uint8_t) );
}
// the frame being created holds an input picture to be encoded
else /* fenc frame */
{
// lowres mode: the luma plane is half the normal width and height
if( h->frames.b_have_lowres )
{
int luma_plane_size = align_plane_size( frame->i_stride_lowres * (frame->i_lines[0]/2 + 2*PADV), disalign );

PREALLOC( frame->buffer_lowres, (4 * luma_plane_size + padh_align) * sizeof(pixel) );

for( int j = 0; j <= !!h->param.i_bframe; j++ )
for( int i = 0; i <= h->param.i_bframe; i++ )
{
PREALLOC( frame->lowres_mvs[j][i], 2*h->mb.i_mb_count*sizeof(int16_t) );
PREALLOC( frame->lowres_mv_costs[j][i], h->mb.i_mb_count*sizeof(int) );
}
PREALLOC( frame->i_propagate_cost, i_mb_count * sizeof(uint16_t) );
for( int j = 0; j <= h->param.i_bframe+1; j++ )
for( int i = 0; i <= h->param.i_bframe+1; i++ )
PREALLOC( frame->lowres_costs[j][i], i_mb_count * sizeof(uint16_t) );

/* mbtree asm can overread the input buffers, make sure we don't read outside of allocated memory. */
prealloc_size += NATIVE_ALIGN;
}
// for adaptive-QP rate control, allocate the per-MB QP offset arrays
if( h->param.rc.i_aq_mode )
{
PREALLOC( frame->f_qp_offset, h->mb.i_mb_count * sizeof(float) );
PREALLOC( frame->f_qp_offset_aq, h->mb.i_mb_count * sizeof(float) );
if( h->frames.b_have_lowres )
PREALLOC( frame->i_inv_qscale_factor, (h->mb.i_mb_count+3) * sizeof(uint16_t) );
}
}

PREALLOC_END( frame->base );

if( i_csp == X264_CSP_NV12 || i_csp == X264_CSP_NV16 )
{
// point the chroma frame->plane into frame->buffer, past the padding
int chroma_padv = i_padv >> (i_csp == X264_CSP_NV12);
frame->plane[1] = frame->buffer[1] + frame->i_stride[1] * chroma_padv + PADH + padh_align;
if( PARAM_INTERLACED )
frame->plane_fld[1] = frame->buffer_fld[1] + frame->i_stride[1] * chroma_padv + PADH + padh_align;
}

for( int p = 0; p < luma_plane_count; p++ )
{
int luma_plane_size = align_plane_size( frame->i_stride[p] * (frame->i_lines[p] + 2*i_padv), disalign );
if( h->param.analyse.i_subpel_refine && b_fdec )
{
for( int i = 0; i < 4; i++ )
{
frame->filtered[p][i] = frame->buffer[p] + i*luma_plane_size + frame->i_stride[p] * i_padv + PADH + padh_align;
frame->filtered_fld[p][i] = frame->buffer_fld[p] + i*luma_plane_size + frame->i_stride[p] * i_padv + PADH + padh_align;
}
frame->plane[p] = frame->filtered[p][0];
frame->plane_fld[p] = frame->filtered_fld[p][0];
}
else
{
// point the luma frame->plane into frame->buffer, past the padding
frame->filtered[p][0] = frame->plane[p] = frame->buffer[p] + frame->i_stride[p] * i_padv + PADH + padh_align;
frame->filtered_fld[p][0] = frame->plane_fld[p] = frame->buffer_fld[p] + frame->i_stride[p] * i_padv + PADH + padh_align;
}
}

if( b_fdec )
{
M32( frame->mv16x16[0] ) = 0;
frame->mv16x16++;

if( h->param.analyse.i_me_method >= X264_ME_ESA )
frame->integral = (uint16_t*)frame->buffer[3] + frame->i_stride[0] * i_padv + PADH;
}
else
{
// picture storage at half resolution
if( h->frames.b_have_lowres )
{
int luma_plane_size = align_plane_size( frame->i_stride_lowres * (frame->i_lines[0]/2 + 2*PADV), disalign );
for( int i = 0; i < 4; i++ )
frame->lowres[i] = frame->buffer_lowres + frame->i_stride_lowres * PADV + PADH + padh_align + i * luma_plane_size;

for( int j = 0; j <= !!h->param.i_bframe; j++ )
for( int i = 0; i <= h->param.i_bframe; i++ )
memset( frame->lowres_mvs[j][i], 0, 2*h->mb.i_mb_count*sizeof(int16_t) );

frame->i_intra_cost = frame->lowres_costs[0][0];
memset( frame->i_intra_cost, -1, (i_mb_count+3) * sizeof(uint16_t) );

if( h->param.rc.i_aq_mode )
/* shouldn't really be initialized, just silences a valgrind false-positive in x264_mbtree_propagate_cost_sse2 */
memset( frame->i_inv_qscale_factor, 0, (h->mb.i_mb_count+3) * sizeof(uint16_t) );
}
}

// initialize the per-frame mutex and condition variable
if( x264_pthread_mutex_init( &frame->mutex, NULL ) )
goto fail;
if( x264_pthread_cond_init( &frame->cv, NULL ) )
goto fail;

#if HAVE_OPENCL
frame->opencl.ocl = h->opencl.ocl;
#endif

return frame;

fail:
x264_free( frame );
return NULL;
}

x264_frame_copy_picture

The function is declared as follows:

int x264_frame_copy_picture( x264_t *h, x264_frame_t *dst, x264_picture_t *src );

Its purpose is straightforward: copy the data from the x264_picture_t input src into the x264_frame_t dst. The implementation:

int x264_frame_copy_picture( x264_t *h, x264_frame_t *dst, x264_picture_t *src )
{
// validate the colorspace
int i_csp = src->img.i_csp & X264_CSP_MASK;
if( dst->i_csp != frame_internal_csp( i_csp ) )
{
x264_log( h, X264_LOG_ERROR, "Invalid input colorspace\n" );
return -1;
}

#if HIGH_BIT_DEPTH
if( !(src->img.i_csp & X264_CSP_HIGH_DEPTH) )
{
x264_log( h, X264_LOG_ERROR, "This build of x264 requires high depth input. Rebuild to support 8-bit input.\n" );
return -1;
}
#else
if( src->img.i_csp & X264_CSP_HIGH_DEPTH )
{
x264_log( h, X264_LOG_ERROR, "This build of x264 requires 8-bit input. Rebuild to support high depth input.\n" );
return -1;
}
#endif

if( BIT_DEPTH != 10 && i_csp == X264_CSP_V210 )
{
x264_log( h, X264_LOG_ERROR, "v210 input is only compatible with bit-depth of 10 bits\n" );
return -1;
}

if( src->i_type < X264_TYPE_AUTO || src->i_type > X264_TYPE_KEYFRAME )
{
x264_log( h, X264_LOG_WARNING, "forced frame type (%d) at %d is unknown\n", src->i_type, h->frames.i_input );
dst->i_forced_type = X264_TYPE_AUTO;
}
else
dst->i_forced_type = src->i_type;

// copy the frame type and key metadata:
dst->i_type = dst->i_forced_type;
dst->i_qpplus1 = src->i_qpplus1;
dst->i_pts = dst->i_reordered_pts = src->i_pts;
dst->param = src->param;
dst->i_pic_struct = src->i_pic_struct;
dst->extra_sei = src->extra_sei;
dst->opaque = src->opaque;
dst->mb_info = h->param.analyse.b_mb_info ? src->prop.mb_info : NULL;
dst->mb_info_free = h->param.analyse.b_mb_info ? src->prop.mb_info_free : NULL;

// copy the pixels according to the colorspace:
uint8_t *pix[3];
int stride[3];
if( i_csp == X264_CSP_YUYV || i_csp == X264_CSP_UYVY )
{
int p = i_csp == X264_CSP_UYVY;
h->mc.plane_copy_deinterleave_yuyv( dst->plane[p], dst->i_stride[p], dst->plane[p^1], dst->i_stride[p^1],
(pixel*)src->img.plane[0], src->img.i_stride[0], h->param.i_width, h->param.i_height );
}
else if( i_csp == X264_CSP_V210 )
{
stride[0] = src->img.i_stride[0];
pix[0] = src->img.plane[0];

h->mc.plane_copy_deinterleave_v210( dst->plane[0], dst->i_stride[0],
dst->plane[1], dst->i_stride[1],
(uint32_t *)pix[0], stride[0]/sizeof(uint32_t), h->param.i_width, h->param.i_height );
}
else if( i_csp >= X264_CSP_BGR )
{
stride[0] = src->img.i_stride[0];
pix[0] = src->img.plane[0];
if( src->img.i_csp & X264_CSP_VFLIP )
{
pix[0] += (h->param.i_height-1) * stride[0];
stride[0] = -stride[0];
}
int b = i_csp==X264_CSP_RGB;
h->mc.plane_copy_deinterleave_rgb( dst->plane[1+b], dst->i_stride[1+b],
dst->plane[0], dst->i_stride[0],
dst->plane[2-b], dst->i_stride[2-b],
(pixel*)pix[0], stride[0]/sizeof(pixel), i_csp==X264_CSP_BGRA ? 4 : 3, h->param.i_width, h->param.i_height );
}
else
{
int v_shift = CHROMA_V_SHIFT;
get_plane_ptr( h, src, &pix[0], &stride[0], 0, 0, 0 );
h->mc.plane_copy( dst->plane[0], dst->i_stride[0], (pixel*)pix[0],
stride[0]/sizeof(pixel), h->param.i_width, h->param.i_height );
if( i_csp == X264_CSP_NV12 || i_csp == X264_CSP_NV16 )
{
get_plane_ptr( h, src, &pix[1], &stride[1], 1, 0, v_shift );
h->mc.plane_copy( dst->plane[1], dst->i_stride[1], (pixel*)pix[1],
stride[1]/sizeof(pixel), h->param.i_width, h->param.i_height>>v_shift );
}
else if( i_csp == X264_CSP_NV21 )
{
get_plane_ptr( h, src, &pix[1], &stride[1], 1, 0, v_shift );
h->mc.plane_copy_swap( dst->plane[1], dst->i_stride[1], (pixel*)pix[1],
stride[1]/sizeof(pixel), h->param.i_width>>1, h->param.i_height>>v_shift );
}
else if( i_csp == X264_CSP_I420 || i_csp == X264_CSP_I422 || i_csp == X264_CSP_YV12 || i_csp == X264_CSP_YV16 )
{
int uv_swap = i_csp == X264_CSP_YV12 || i_csp == X264_CSP_YV16;
get_plane_ptr( h, src, &pix[1], &stride[1], uv_swap ? 2 : 1, 1, v_shift );
get_plane_ptr( h, src, &pix[2], &stride[2], uv_swap ? 1 : 2, 1, v_shift );
h->mc.plane_copy_interleave( dst->plane[1], dst->i_stride[1],
(pixel*)pix[1], stride[1]/sizeof(pixel),
(pixel*)pix[2], stride[2]/sizeof(pixel),
h->param.i_width>>1, h->param.i_height>>v_shift );
}
else if( i_csp == X264_CSP_I444 || i_csp == X264_CSP_YV24 )
{
get_plane_ptr( h, src, &pix[1], &stride[1], i_csp==X264_CSP_I444 ? 1 : 2, 0, 0 );
get_plane_ptr( h, src, &pix[2], &stride[2], i_csp==X264_CSP_I444 ? 2 : 1, 0, 0 );
h->mc.plane_copy( dst->plane[1], dst->i_stride[1], (pixel*)pix[1],
stride[1]/sizeof(pixel), h->param.i_width, h->param.i_height );
h->mc.plane_copy( dst->plane[2], dst->i_stride[2], (pixel*)pix[2],
stride[2]/sizeof(pixel), h->param.i_width, h->param.i_height );
}
}
return 0;
}
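The planar branches above all reduce to stride-aware row copies. A scalar sketch of what h->mc.plane_copy does (x264 actually dispatches to optimized assembly; this naive version is only for illustration):

```c
#include <stdint.h>
#include <string.h>

/* Copy a w x h block of 8-bit samples, honoring independent strides.
 * Strides may be larger than w because both buffers carry padding. */
static void plane_copy_ref( uint8_t *dst, int i_dst_stride,
                            const uint8_t *src, int i_src_stride,
                            int w, int h )
{
    for( int y = 0; y < h; y++ )
        memcpy( dst + (size_t)y * i_dst_stride, src + (size_t)y * i_src_stride, w );
}
```

Copying a 2x2 block from a stride-4 source into a stride-8 destination leaves the destination's padding bytes untouched, which is exactly why the copy must go row by row rather than in one memcpy.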

x264_frame_expand_border_mod16

This function extends the plane buffers of the x264_frame_t instance frame so that the picture's width and height become divisible by 16. The implementation:

void x264_frame_expand_border_mod16( x264_t *h, x264_frame_t *frame )
{
    for( int i = 0; i < frame->i_plane; i++ )
    {
        int i_width = h->param.i_width;
        int h_shift = i && CHROMA_H_SHIFT;
        int v_shift = i && CHROMA_V_SHIFT;
        int i_height = h->param.i_height >> v_shift;
        int i_padx = (h->mb.i_mb_width * 16 - h->param.i_width);
        int i_pady = (h->mb.i_mb_height * 16 - h->param.i_height) >> v_shift;

        if( i_padx )
        {
            for( int y = 0; y < i_height; y++ )
                pixel_memset( &frame->plane[i][y*frame->i_stride[i] + i_width],
                              &frame->plane[i][y*frame->i_stride[i] + i_width - 1-h_shift],
                              i_padx>>h_shift, sizeof(pixel)<<h_shift );
        }
        if( i_pady )
        {
            for( int y = i_height; y < i_height + i_pady; y++ )
                memcpy( &frame->plane[i][y*frame->i_stride[i]],
                        &frame->plane[i][(i_height-(~y&PARAM_INTERLACED)-1)*frame->i_stride[i]],
                        (i_width + i_padx) * sizeof(pixel) );
        }
    }
}
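The padding amounts and the edge replication can be sketched for the simplest case, a progressive 8-bit luma plane (this simplification ignores the interlacing and chroma shifts handled above):

```c
#include <stdint.h>
#include <string.h>

/* Extend a w x h plane to mod-16 dimensions by replicating the rightmost
 * column and then the bottom row, in the spirit of
 * x264_frame_expand_border_mod16. */
static void expand_mod16_luma( uint8_t *plane, int stride, int w, int h )
{
    int padx = ((w + 15) & ~15) - w;   /* extra columns on the right */
    int pady = ((h + 15) & ~15) - h;   /* extra rows at the bottom   */

    for( int y = 0; y < h; y++ )       /* replicate the last pixel of each row */
        memset( plane + y*stride + w, plane[y*stride + w - 1], padx );
    for( int y = h; y < h + pady; y++ )  /* replicate the last (extended) row */
        memcpy( plane + y*stride, plane + (h-1)*stride, w + padx );
}
```

For 1920x1080 only the bottom needs work: padx = 0 and pady = 8, so the last row is repeated eight times to reach 1088 lines (68 macroblock rows).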

2.3 Queueing the Picture

Once an empty frame structure has been obtained from the unused-frame pool and the input picture's data has been copied into it, the next step is to place this frame into the picture queue for its slice-type decision.

int x264_encoder_encode( x264_t *h,
                         x264_nal_t **pp_nal, int *pi_nal,
                         x264_picture_t *pic_in,
                         x264_picture_t *pic_out )
{
    // ......
    /* 2: Place the frame into the queue for its slice type decision */
    x264_lookahead_put_frame( h, fenc );

    if( h->frames.i_input <= h->frames.i_delay + 1 - h->i_thread_frames )
    {
        /* Nothing yet to encode, waiting for filling of buffers */
        pic_out->i_type = X264_TYPE_AUTO;
        return 0;
    }
    // ......
}

x264_lookahead_put_frame itself is very simple:

void x264_lookahead_put_frame( x264_t *h, x264_frame_t *frame )
{
    if( h->param.i_sync_lookahead )
        x264_sync_frame_list_push( &h->lookahead->ifbuf, frame );
    else
        x264_sync_frame_list_push( &h->lookahead->next, frame );
}

When tune is set to zerolatency, h->param.i_sync_lookahead defaults to 0, so x264_lookahead_put_frame executes

  • x264_sync_frame_list_push( &h->lookahead->next, frame )

x264_sync_frame_list_push is also straightforward:

void x264_sync_frame_list_push( x264_sync_frame_list_t *slist, x264_frame_t *frame )
{
    x264_pthread_mutex_lock( &slist->mutex );
    while( slist->i_size == slist->i_max_size )
        x264_pthread_cond_wait( &slist->cv_empty, &slist->mutex );
    slist->list[ slist->i_size++ ] = frame;
    x264_pthread_mutex_unlock( &slist->mutex );
    x264_pthread_cond_broadcast( &slist->cv_fill );
}

From the calls above, x264_lookahead_put_frame's purpose is simply to append fenc to h->lookahead->next.list[], blocking if the list is full.

In the subsequent check, h->frames.i_input counts the pictures input so far; it is initialized to 0 in x264_encoder_open and incremented each time a new input picture is accepted. Until enough pictures have been read to fill the internal buffers, x264_encoder_encode simply returns 0.
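The warm-up condition can be isolated into a tiny predicate (a hypothetical helper mirroring the check quoted above):

```c
#include <assert.h>

/* Returns nonzero while the encoder is still filling its internal buffers
 * and x264_encoder_encode will produce no output for the current input. */
static int still_buffering( int i_input, int i_delay, int i_thread_frames )
{
    return i_input <= i_delay + 1 - i_thread_frames;
}
```

With a frame delay of 3 and a single frame thread, the first three calls produce no output (i_input is 1, 2, 3 after the increment) and output begins with the fourth input picture.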

2.4 Frame Analysis

The next step in x264_encoder_encode is to analyze the current frame. The main tasks are:

  1. decide the frame's coded type;
  2. compute the frame's duration.
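The duration computation (task 2) boils down to a table lookup plus the VUI timing fields. A sketch for the constant-frame-rate path (the constant mirrors the delta_tfi_divisor table quoted in x264_slicetype_decide; the helper name is illustrative):

```c
/* delta_tfi_divisor maps i_pic_struct to a duration in field periods;
 * PIC_STRUCT_PROGRESSIVE (index 1) counts as 2 fields. */
static const int tfi_divisor[10] = { 0, 2, 1, 1, 2, 2, 3, 3, 4, 6 };

/* Convert to seconds the way f_duration is computed in x264_slicetype_decide:
 * i_duration * num_units_in_tick / time_scale. */
static double frame_duration_sec( int i_pic_struct,
                                  unsigned num_units_in_tick, unsigned time_scale )
{
    return (double)tfi_divisor[i_pic_struct] * num_units_in_tick / time_scale;
}
```

For 25 fps progressive video the VUI carries num_units_in_tick = 1 and time_scale = 50 (the field rate), so a progressive frame (divisor 2) yields 2 * 1 / 50 = 0.04 s.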

Inside x264_encoder_encode this looks like:

h->i_frame++;
/* 3: The picture is analyzed in the lookahead */
if( !h->frames.current[0] )
    x264_lookahead_get_frames( h );

if( !h->frames.current[0] && x264_lookahead_is_empty( h ) )
    return encoder_frame_end( thread_oldest, thread_current, pp_nal, pi_nal, pic_out );

x264_lookahead_get_frames is implemented as:

void x264_lookahead_get_frames( x264_t *h )
{
    if( h->param.i_sync_lookahead )
    {   /* We have a lookahead thread, so get frames from there */
        x264_pthread_mutex_lock( &h->lookahead->ofbuf.mutex );
        while( !h->lookahead->ofbuf.i_size && h->lookahead->b_thread_active )
            x264_pthread_cond_wait( &h->lookahead->ofbuf.cv_fill, &h->lookahead->ofbuf.mutex );
        lookahead_encoder_shift( h );
        x264_pthread_mutex_unlock( &h->lookahead->ofbuf.mutex );
    }
    else
    {   /* We are not running a lookahead thread, so perform all the slicetype decide on the fly */
        if( h->frames.current[0] || !h->lookahead->next.i_size )
            return;

        x264_slicetype_decide( h );
        lookahead_update_last_nonb( h, h->lookahead->next.list[0] );
        int shift_frames = h->lookahead->next.list[0]->i_bframes + 1;
        lookahead_shift( &h->lookahead->ofbuf, &h->lookahead->next, shift_frames );

        /* For MB-tree and VBV lookahead, we have to perform propagation analysis on I-frames too. */
        if( h->lookahead->b_analyse_keyframe && IS_X264_TYPE_I( h->lookahead->last_nonb->i_type ) )
            x264_slicetype_analyse( h, shift_frames );

        lookahead_encoder_shift( h );
    }
}

The core of this function is x264_slicetype_decide, implemented as:

void x264_slicetype_decide( x264_t *h )
{
x264_frame_t *frames[X264_BFRAME_MAX+2];
x264_frame_t *frm;
int bframes;
int brefs;

if( !h->lookahead->next.i_size )
return;

int lookahead_size = h->lookahead->next.i_size;

for( int i = 0; i < h->lookahead->next.i_size; i++ )
{
if( h->param.b_vfr_input ) // with VFR input, rate control uses timestamps and the timebase; otherwise it uses fps
{
if( lookahead_size-- > 1 )
h->lookahead->next.list[i]->i_duration = 2 * (h->lookahead->next.list[i+1]->i_pts - h->lookahead->next.list[i]->i_pts);
else
h->lookahead->next.list[i]->i_duration = h->i_prev_duration;
}
else
// delta_tfi_divisor[10] = { 0, 2, 1, 1, 2, 2, 3, 3, 4, 6 };
// compute the frame's duration in timebase ticks (converted to seconds below)
h->lookahead->next.list[i]->i_duration = delta_tfi_divisor[h->lookahead->next.list[i]->i_pic_struct];
h->i_prev_duration = h->lookahead->next.list[i]->i_duration;
h->lookahead->next.list[i]->f_duration = (double)h->lookahead->next.list[i]->i_duration
* h->sps->vui.i_num_units_in_tick
/ h->sps->vui.i_time_scale;

if( h->lookahead->next.list[i]->i_frame > h->i_disp_fields_last_frame && lookahead_size > 0 )
{
h->lookahead->next.list[i]->i_field_cnt = h->i_disp_fields;
h->i_disp_fields += h->lookahead->next.list[i]->i_duration;
h->i_disp_fields_last_frame = h->lookahead->next.list[i]->i_frame;
}
else if( lookahead_size == 0 )
{
h->lookahead->next.list[i]->i_field_cnt = h->i_disp_fields;
h->lookahead->next.list[i]->i_duration = h->i_prev_duration;
}
}

if( h->param.rc.b_stat_read )
{
/* Use the frame types from the first pass */
for( int i = 0; i < h->lookahead->next.i_size; i++ )
h->lookahead->next.list[i]->i_type =
x264_ratecontrol_slice_type( h, h->lookahead->next.list[i]->i_frame );
}
else if( (h->param.i_bframe && h->param.i_bframe_adaptive)
|| h->param.i_scenecut_threshold
|| h->param.rc.b_mb_tree
|| (h->param.rc.i_vbv_buffer_size && h->param.rc.i_lookahead) )
x264_slicetype_analyse( h, 0 );

// decide the coded frame type
for( bframes = 0, brefs = 0;; bframes++ )
{
frm = h->lookahead->next.list[bframes];

// the following if-blocks fix up frame types that are incompatible with the current settings
if( frm->i_forced_type != X264_TYPE_AUTO && frm->i_type != frm->i_forced_type &&
!(frm->i_forced_type == X264_TYPE_KEYFRAME && IS_X264_TYPE_I( frm->i_type )) )
{
x264_log( h, X264_LOG_WARNING, "forced frame type (%d) at %d was changed to frame type (%d)\n",
frm->i_forced_type, frm->i_frame, frm->i_type );
}

if( frm->i_type == X264_TYPE_BREF && h->param.i_bframe_pyramid < X264_B_PYRAMID_NORMAL &&
brefs == h->param.i_bframe_pyramid )
{
frm->i_type = X264_TYPE_B;
x264_log( h, X264_LOG_WARNING, "B-ref at frame %d incompatible with B-pyramid %s \n",
frm->i_frame, x264_b_pyramid_names[h->param.i_bframe_pyramid] );
}
/* pyramid with multiple B-refs needs a big enough dpb that the preceding P-frame stays available.
smaller dpb could be supported by smart enough use of mmco, but it's easier just to forbid it. */
else if( frm->i_type == X264_TYPE_BREF && h->param.i_bframe_pyramid == X264_B_PYRAMID_NORMAL &&
brefs && h->param.i_frame_reference <= (brefs+3) )
{
frm->i_type = X264_TYPE_B;
x264_log( h, X264_LOG_WARNING, "B-ref at frame %d incompatible with B-pyramid %s and %d reference frames\n",
frm->i_frame, x264_b_pyramid_names[h->param.i_bframe_pyramid], h->param.i_frame_reference );
}

if( frm->i_type == X264_TYPE_KEYFRAME )
frm->i_type = h->param.b_open_gop ? X264_TYPE_I : X264_TYPE_IDR;

/* Limit GOP size */
if( (!h->param.b_intra_refresh || frm->i_frame == 0) && frm->i_frame - h->lookahead->i_last_keyframe >= h->param.i_keyint_max )
{
// 判断 I 帧和 IDR 帧
if( frm->i_type == X264_TYPE_AUTO || frm->i_type == X264_TYPE_I )
frm->i_type = h->param.b_open_gop && h->lookahead->i_last_keyframe >= 0 ? X264_TYPE_I : X264_TYPE_IDR;
int warn = frm->i_type != X264_TYPE_IDR;
if( warn && h->param.b_open_gop )
warn &= frm->i_type != X264_TYPE_I;
if( warn )
{
x264_log( h, X264_LOG_WARNING, "specified frame type (%d) at %d is not compatible with keyframe interval\n", frm->i_type, frm->i_frame );
frm->i_type = h->param.b_open_gop && h->lookahead->i_last_keyframe >= 0 ? X264_TYPE_I : X264_TYPE_IDR;
}
}
if( frm->i_type == X264_TYPE_I && frm->i_frame - h->lookahead->i_last_keyframe >= h->param.i_keyint_min )
{
// 当前帧被设定为I帧,且间隔大于最小关键帧设置
if( h->param.b_open_gop )
{
h->lookahead->i_last_keyframe = frm->i_frame; // Use display order
if( h->param.b_bluray_compat )
h->lookahead->i_last_keyframe -= bframes; // Use bluray order
frm->b_keyframe = 1;
}
else
frm->i_type = X264_TYPE_IDR;
}
if( frm->i_type == X264_TYPE_IDR )
{
/* Close GOP */
h->lookahead->i_last_keyframe = frm->i_frame;//记录当前帧为最近的一个IDR帧
frm->b_keyframe = 1;
if( bframes > 0 )
{
bframes--;
h->lookahead->next.list[bframes]->i_type = X264_TYPE_P;
}
}

if( bframes == h->param.i_bframe ||
!h->lookahead->next.list[bframes+1] )
{
if( IS_X264_TYPE_B( frm->i_type ) )
x264_log( h, X264_LOG_WARNING, "specified frame type is not compatible with max B-frames\n" );
if( frm->i_type == X264_TYPE_AUTO
|| IS_X264_TYPE_B( frm->i_type ) )
frm->i_type = X264_TYPE_P;
}

if( frm->i_type == X264_TYPE_BREF )
brefs++;

if( frm->i_type == X264_TYPE_AUTO )
frm->i_type = X264_TYPE_B;

else if( !IS_X264_TYPE_B( frm->i_type ) ) break;
}

    if( bframes )
        h->lookahead->next.list[bframes-1]->b_last_minigop_bframe = 1;
    h->lookahead->next.list[bframes]->i_bframes = bframes;

    /* insert a bref into the sequence */
    if( h->param.i_bframe_pyramid && bframes > 1 && !brefs )
    {
        h->lookahead->next.list[(bframes-1)/2]->i_type = X264_TYPE_BREF;
        brefs++;
    }

    /* calculate the frame costs ahead of time for x264_rc_analyse_slice while we still have lowres */
    // Estimate the coding cost of these frames
    // In CQP mode the quantization parameter is fixed, so this step is skipped
    if( h->param.rc.i_rc_method != X264_RC_CQP )
    {
        x264_mb_analysis_t a;
        int p0, p1, b;
        p1 = b = bframes + 1;

        lowres_context_init( h, &a );

        frames[0] = h->lookahead->last_nonb;
        memcpy( &frames[1], h->lookahead->next.list, (bframes+1) * sizeof(x264_frame_t*) );
        if( IS_X264_TYPE_I( h->lookahead->next.list[bframes]->i_type ) )
            p0 = bframes + 1;
        else // P
            p0 = 0;

        // Estimate the current frame's coding cost; internally this calls down
        // through slicetype_slice_cost and slicetype_mb_cost layer by layer
        slicetype_frame_cost( h, &a, frames, p0, p1, b );

        if( (p0 != p1 || bframes) && h->param.rc.i_vbv_buffer_size )
        {
            /* We need the intra costs for row SATDs. */
            slicetype_frame_cost( h, &a, frames, b, b, b );

            /* We need B-frame costs for row SATDs. */
            p0 = 0;
            for( b = 1; b <= bframes; b++ )
            {
                if( frames[b]->i_type == X264_TYPE_B )
                    for( p1 = b; frames[p1]->i_type == X264_TYPE_B; )
                        p1++;
                else
                    p1 = bframes + 1;
                slicetype_frame_cost( h, &a, frames, p0, p1, b );
                if( frames[b]->i_type == X264_TYPE_BREF )
                    p0 = b;
            }
        }
    }

    /* Analyse for weighted P frames */
    if( !h->param.rc.b_stat_read && h->lookahead->next.list[bframes]->i_type == X264_TYPE_P
        && h->param.analyse.i_weighted_pred >= X264_WEIGHTP_SIMPLE )
    {
        x264_emms();
        x264_weights_analyse( h, h->lookahead->next.list[bframes], h->lookahead->last_nonb, 0 );
    }

    /* shift sequence to coded order.
       use a small temporary list to avoid shifting the entire next buffer around */
    int i_coded = h->lookahead->next.list[0]->i_frame;
    if( bframes )
    {
        int idx_list[] = { brefs+1, 1 };
        for( int i = 0; i < bframes; i++ )
        {
            int idx = idx_list[h->lookahead->next.list[i]->i_type == X264_TYPE_BREF]++;
            frames[idx] = h->lookahead->next.list[i];
            frames[idx]->i_reordered_pts = h->lookahead->next.list[idx]->i_pts;
        }
        frames[0] = h->lookahead->next.list[bframes];
        frames[0]->i_reordered_pts = h->lookahead->next.list[0]->i_pts;
        memcpy( h->lookahead->next.list, frames, (bframes+1) * sizeof(x264_frame_t*) );
    }

    for( int i = 0; i <= bframes; i++ )
    {
        h->lookahead->next.list[i]->i_coded = i_coded++;
        if( i )
        {
            calculate_durations( h, h->lookahead->next.list[i], h->lookahead->next.list[i-1], &h->i_cpb_delay, &h->i_coded_fields );
            h->lookahead->next.list[0]->f_planned_cpb_duration[i-1] = (double)h->lookahead->next.list[i]->i_cpb_duration *
                                                                      h->sps->vui.i_num_units_in_tick / h->sps->vui.i_time_scale;
        }
        else
            calculate_durations( h, h->lookahead->next.list[i], NULL, &h->i_cpb_delay, &h->i_coded_fields );
    }
}

The last for loop of this function calls calculate_durations to maintain the running total duration of the CPB. It is implemented as follows:

static void calculate_durations( x264_t *h, x264_frame_t *cur_frame, x264_frame_t *prev_frame, int64_t *i_cpb_delay, int64_t *i_coded_fields )
{
    cur_frame->i_cpb_delay = *i_cpb_delay;
    cur_frame->i_dpb_output_delay = cur_frame->i_field_cnt - *i_coded_fields;

    // add a correction term for frame reordering
    cur_frame->i_dpb_output_delay += h->sps->vui.i_num_reorder_frames*2;

    // fix possible negative dpb_output_delay because of pulldown changes and reordering
    if( cur_frame->i_dpb_output_delay < 0 )
    {
        cur_frame->i_cpb_delay += cur_frame->i_dpb_output_delay;
        cur_frame->i_dpb_output_delay = 0;
        if( prev_frame )
            prev_frame->i_cpb_duration += cur_frame->i_dpb_output_delay;
    }

    // don't reset cpb delay for IDR frames when using intra-refresh
    if( cur_frame->b_keyframe && !h->param.b_intra_refresh )
        *i_cpb_delay = 0;

    // Add the current frame's duration to the CPB's running total
    *i_cpb_delay += cur_frame->i_duration;
    *i_coded_fields += cur_frame->i_duration;
    cur_frame->i_cpb_duration = cur_frame->i_duration;
}

PS: The x264_frame_t structure carries two duration fields whose units differ, so take care to distinguish them:

  • i_duration: expressed in the timebase signaled in the SPS;
  • f_duration: expressed in seconds of natural time;

Author: Yin Wenjie
Reprint policy: Unless otherwise stated, all articles in this blog are published under the CC BY 4.0 reprint policy. If reproduced, please credit the source: Yin Wenjie!