Calling FFmpeg from C

Taking screenshots at fixed time intervals with FFmpeg (RTMP live stream)

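My previous post summarized several ways to take screenshots by calling FFmpeg from CMD. Those methods kept getting better, but none of them was fully satisfactory.

For screenshots to be reasonably controllable, you have to control the whole decoding and encoding process yourself, and that can only be done in code.

The Golang wrappers for FFmpeg on GitHub did not feel easy to use to me, or at least no easier than writing C directly, so I decided to use C here.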

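For the FFmpeg decoding and encoding process I mainly referred to the following blog posts:

FFmpeg encoding/decoding 1: overview of the full transcoding flow

Getting started with ffmpeg programming in C

See those posts for the details of the decoding and encoding process.

The main flow for taking a screenshot is: input -> demux -> find a single key packet -> decode -> convert the frame -> encode into a packet -> mux -> output.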

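FFmpeg development environment

FFmpeg download and configuration (Visual Studio)

I did not write this in C++ because I ran into a pile of problems I could not solve, so I wrote it in plain C (a .c file rather than a .cpp file).

Code 1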

// References:
// https://www.cnblogs.com/leisure_chn/p/10584901.html
// https://blog.csdn.net/lightfish_zhang/article/details/86594694

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libswscale/swscale.h>

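// Global variables
char nameBuffer[256], intBuffer[128];
char* infilename, * outfilename_head;
int inWidth, inHeight;
int outWidth, outHeight;    // output width and height; could be taken from command-line arguments
int pts;

// Global counters
int picture_num = 1000;     // total number of screenshots; could be taken from command-line arguments
int time_interval = 1;       // seconds between screenshots; could be taken from command-line arguments. The lower bound is the GOP length: a smaller value behaves like one GOP
int picture_count = 0, pkt_count = 0, key_pkt_count = 0, out_num = 0;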

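// Decoding-related variables
AVFormatContext* ifmt_ctx = NULL;
AVCodecContext* deCodecCtx = NULL;
const AVCodec* deCodec = NULL;   // const also compiles cleanly where avcodec_find_decoder() returns a const pointer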
AVPacket* dePkt = NULL;
AVFrame* deFrame = NULL;
int video_stream_index;

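// Encoding-related variables
AVFormatContext* ofmt_ctx = NULL;
AVOutputFormat* ofmt = NULL;
AVCodecContext* enCodecCtx = NULL;
AVStream* ostream = NULL;
const AVCodec* enCodec = NULL;   // const also compiles cleanly where avcodec_find_encoder() returns a const pointer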
AVPacket* enPkt = NULL;
AVFrame* enframe = NULL;
struct SwsContext* sws_ctx = NULL;

// Initialize the decoding-related variables
int InitDecodeParam() {
	int ret;

	/**********************************************************************/
	/******************* Allocate the input AVFormatContext ****************/
	ifmt_ctx = avformat_alloc_context();
	ret = avformat_open_input(&ifmt_ctx, infilename, NULL, NULL);
	if (ret < 0) {
		fprintf(stderr, "Could not open input file '%s'.\n", infilename);
		return ret;
	}

	/**********************************************************************/
	/********************** Read input stream information ******************/
	ret = avformat_find_stream_info(ifmt_ctx, NULL);
	if (ret < 0) {
		fprintf(stderr, "Failed to retrieve input stream information.\n");
		return ret;
	}

	/**********************************************************************/
	/*********************** Print input information ***********************/
	fprintf(stdout, "--------------------------------------\n");
	av_dump_format(ifmt_ctx, 0, infilename, 0);
	fprintf(stdout, "--------------------------------------\n");

	/**********************************************************************/
	/************ Find the video stream with av_find_best_stream ***********/
	video_stream_index = av_find_best_stream(ifmt_ctx, AVMEDIA_TYPE_VIDEO, -1, -1, NULL, 0);
	if (video_stream_index < 0) {
		fprintf(stderr, "Could not find video stream.\n");
		return -1;
	}

	/**********************************************************************/
	/**************************** Find the decoder *************************/
	deCodec = avcodec_find_decoder(ifmt_ctx->streams[video_stream_index]->codecpar->codec_id);
	if (!deCodec) {
		fprintf(stderr, "Cannot find decoder\n");
		return -1;
	}

	/**********************************************************************/
	/*********************** Allocate the decoder context ******************/
	deCodecCtx = avcodec_alloc_context3(deCodec);
	if (!deCodecCtx) {
		fprintf(stderr, "Failed to allocate the decoder context for stream\n");
		return -1;
	}

	/**********************************************************************/
	/********************** Copy parameters to the decoder *****************/
	// Different inputs need different decoder settings; the decoder can only decode this stream after receiving its parameters
	ret = avcodec_parameters_to_context(deCodecCtx, ifmt_ctx->streams[video_stream_index]->codecpar);
	if (ret < 0) {
		fprintf(stderr, "Failed to copy decoder parameters to input decoder context for stream\n");
		return -1;
	}

	/**********************************************************************/
	/**************************** Open the decoder *************************/
	ret = avcodec_open2(deCodecCtx, deCodec, NULL);
	if (ret < 0) {
		fprintf(stderr, "Cannot open decoder\n");
		return -1;
	}

	/**********************************************************************/
	/*************************** Allocate the packet ***********************/
	dePkt = av_packet_alloc();
	if (!dePkt) {
		fprintf(stderr, "Could not allocate pkt\n");
		return -1;
	}
	
	/**********************************************************************/
	/**************************** Allocate the frame ***********************/
	deFrame = av_frame_alloc();
	if (!deFrame) {
		fprintf(stderr, "Could not allocate frame\n");
		return -1;
	}

	// Save the width and height in the globals
	inWidth = deCodecCtx->width;
	inHeight = deCodecCtx->height;

	return 0;
}

// Initialize the encoding-related variables
int InitEncodeParam() {
	int ret;

	/**********************************************************************/
	/******************* Allocate the output AVFormatContext ***************/
	ret = avformat_alloc_output_context2(&ofmt_ctx, NULL, "image2", NULL);
	if (ret < 0) {
		fprintf(stderr, "Alloc output context failed.\n");
		return -1;
	}

	ofmt = ofmt_ctx->oformat;
	ofmt->flags = ofmt->flags & ~AVFMT_NOFILE; // clear AVFMT_NOFILE so that output goes through ofmt_ctx->pb, which Frame2JPG() opens itself

    /**********************************************************************/
	/********************* Create a new stream for output ******************/
	ostream = avformat_new_stream(ofmt_ctx, NULL);
	if (ostream == NULL) {
		fprintf(stderr, "Frame2JPG:: avformat_new_stream error.\n");
		return -1;
	}
	ostream->id = ofmt_ctx->nb_streams - 1;

	/**********************************************************************/
	/**************************** Find the encoder *************************/
	enCodec = avcodec_find_encoder(ofmt->video_codec);
	if (!enCodec) {
		fprintf(stderr, "avcodec_find_encoder() error.\n");
		return -1;
	}

	/**********************************************************************/
	/*********************** Allocate the encoder context ******************/
	enCodecCtx = avcodec_alloc_context3(enCodec);
	if (!enCodecCtx) {
		fprintf(stderr, "Could not allocate video codec context\n");
		return -1;   // report the error to the caller like every other branch, instead of exiting here
	}
	enCodecCtx->codec_id = ofmt->video_codec;
	enCodecCtx->codec_type = AVMEDIA_TYPE_VIDEO;
	enCodecCtx->pix_fmt = AV_PIX_FMT_YUVJ420P;
	enCodecCtx->color_range = AVCOL_RANGE_JPEG;
	enCodecCtx->width = outWidth;
	enCodecCtx->height = outHeight;
	enCodecCtx->time_base.num = 1;
	enCodecCtx->time_base.den = 25;

	/**********************************************************************/
	/**************************** Open the encoder *************************/
	ret = avcodec_open2(enCodecCtx, enCodec, NULL);
	if (ret < 0) {
		fprintf(stderr, "Could not open codec.\n");
		return ret;
	}

	/**********************************************************************/
	/************************ Set the stream parameters ********************/
	// Copy the parameters from enCodecCtx into ostream; otherwise the stream does not know what it contains
	ret = avcodec_parameters_from_context(ostream->codecpar, enCodecCtx);
	if (ret < 0) {
		fprintf(stderr, "Could not copy the stream parameters\n");
		return ret;
	}

	/**********************************************************************/
	/****************** Allocate the packet used for encoding **************/
	enPkt = av_packet_alloc();
	if (!enPkt) {
		fprintf(stderr, "Could not allocate pkt\n");
		return -1;
	}

	/**********************************************************************/
	/******************* Allocate the frame used for encoding **************/
	// The decoded frame is converted into this frame before it is sent to the encoder; without the conversion the output picture is unreadable
	enframe = av_frame_alloc();
	if (!enframe) {
		fprintf(stderr, "Could not allocate frame\n");
		return -1;
	}

	enframe->format = enCodecCtx->pix_fmt;
	enframe->color_range = enCodecCtx->color_range;
	enframe->width = enCodecCtx->width;
	enframe->height = enCodecCtx->height;

	// Actually allocate the buffer for the frame
	ret = av_frame_get_buffer(enframe, 0);
	if (ret < 0) {
		fprintf(stderr, "Could not allocate frame data.\n");
		return ret;
	}

	// Ensure that the frame data is writable, avoiding data copy if possible.
	// Meant to avoid an extra data copy; I am not sure it actually helps here.
	ret = av_frame_make_writable(enframe);
	if (ret < 0)
		return ret;

	return 0;
}

// Frame conversion context; initializing it once is enough
void allocateSws() {
	if (!sws_ctx) {
		sws_ctx = sws_getContext(deFrame->width, deFrame->height,
			deFrame->format,
			enCodecCtx->width, enCodecCtx->height,
			enCodecCtx->pix_fmt,
			SWS_FAST_BILINEAR, NULL, NULL, NULL);
	}
}

// Convert one deFrame into a JPEG picture
int Frame2JPG(char* out_file) {
	int ret;

	/**********************************************************************/
	/************************* Open the output file ************************/
	// pb is an AVIOContext; this opens out_file and stores the handle in pb.
	// This way each frame-to-JPEG conversion reuses the encoder and only opens a new file.
	ret = avio_open(&ofmt_ctx->pb, out_file, AVIO_FLAG_WRITE);
	if (ret < 0) {
		fprintf(stderr, "Could not open output file '%s'\n", out_file);
		goto end;
	}

	// Print the output file information
	printf("--------------------------------------\n");
	av_dump_format(ofmt_ctx, 0, out_file, 1);
	printf("--------------------------------------\n");

	/**********************************************************************/
	/********************* Write the header to the file ********************/
	ret = avformat_write_header(ofmt_ctx, NULL);
	if (ret < 0) {
		fprintf(stderr, "Write head to output file error.\n");
		goto end;
	}

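	/**********************************************************************/
	/**************** Convert deFrame and write it into enframe ************/
	allocateSws();
	if (!sws_ctx) {
		fprintf(stderr, "Could not initialize the conversion context\n");
		ret = -1;   // ret still held the success code from avformat_write_header()
		goto end;
	}
	// The encoder may still hold a reference to enframe from the previous call,
	// so make sure the buffer is writable before overwriting it.
	ret = av_frame_make_writable(enframe);
	if (ret < 0) {
		goto end;
	}
	sws_scale(sws_ctx, (const uint8_t* const*)deFrame->data,
		deFrame->linesize, 0, deFrame->height, enframe->data,
		enframe->linesize);
	enframe->pts = pts++;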

	/**********************************************************************/
	/********************* Send enframe to the encoder *********************/
	// Encode enframe into a packet and write the packet to the output file.
	// The encoded data goes into enPkt, so dePkt (still owned by the read loop) is left alone.
	ret = avcodec_send_frame(enCodecCtx, enframe);
	if (ret < 0) {
		fprintf(stderr, "Send frame to encoder error.\n");
		goto end;
	}
	while (ret >= 0) {
		ret = avcodec_receive_packet(enCodecCtx, enPkt);
		if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) {
			ret = 1;
			break;
		} else if (ret < 0) {
			fprintf(stderr, "Error during encoding\n");
			goto end;
		}
		printf("--------------------------------------\n");
		fprintf(stdout, "Write to picture [num = %d] [name = %s] [data size = %5d byte]\n", 
			picture_count, out_file, enPkt->size);
		printf("--------------------------------------\n");
		ret = av_write_frame(ofmt_ctx, enPkt);
		if (ret < 0) {
			fprintf(stderr, "Write frame to output failed.\n");
			goto end;
		}
		av_packet_unref(enPkt);
	}

	/**********************************************************************/
	/********************* Write the trailer to the file *******************/
	av_write_trailer(ofmt_ctx);

end:

	if (ret >= 0) {
		printf("Encoding picture success.\n");
	} else {
		printf("Encoding picture Failed.\n");
	}

	/**********************************************************************/
	/************************* Close the output file ***********************/
	avio_closep(&ofmt_ctx->pb);

	return ret;
}

// Build the output file name, e.g. test0.jpg  test1.jpg  test2.jpg
// The index wraps around after 10 files, so older screenshots get overwritten.
void constuctFileName() {
	// snprintf is standard C and bounds-checked, unlike the non-standard itoa() plus strcat()
	snprintf(nameBuffer, sizeof(nameBuffer), "%s%d.jpg", outfilename_head, out_num);
	out_num = (out_num + 1) % 10;
}

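void closeDecodeParam() {
	av_frame_free(&deFrame);
	av_packet_free(&dePkt);
	avcodec_free_context(&deCodecCtx);
	// A context opened with avformat_open_input() must be closed with avformat_close_input(),
	// which also closes the underlying input and sets the pointer to NULL.
	avformat_close_input(&ifmt_ctx);
	video_stream_index = 0;
}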

void closeEncodeParam() {
	av_frame_free(&enframe);
	av_packet_free(&enPkt);
	avcodec_free_context(&enCodecCtx);
	avformat_free_context(ofmt_ctx);
	sws_freeContext(sws_ctx);
}

int main(int argc, char** argv) {
	if (argc < 3) {
		// The second argument is a name prefix: the index and ".jpg" are appended to it
		fprintf(stderr, "Usage: %s <input file> <output file name prefix>\n", argv[0]);
		return 1;
	}

	infilename = argv[1];
	outfilename_head = argv[2];

	printf("------------- begin -------------\n");

	int ret;

	// InitDecodeParam
	ret = InitDecodeParam();
	if (ret < 0) {
		goto end;
	}

	// Set the output width and height
	outWidth = inWidth;
	outHeight = inHeight;

	// InitEncodeParam
	ret = InitEncodeParam();
	if (ret < 0) {
		goto end;
	}

	/**********************************************************************/
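	/************* Main loop: read, decode and save screenshots ************/
	// 1. Only keyframes are sent to the decoder; dePkt->flags tells whether a packet is a keyframe. This reduces the CPU cost of decoding.
	// 2. One screenshot is taken per time interval; dePkt->pts gives the time of the packet.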
	while (picture_count < picture_num && av_read_frame(ifmt_ctx, dePkt) >= 0) {
		pkt_count++;
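		// Only handle packets that belong to the video stream and are keyframes, so most packets are never decoded
		if (dePkt->stream_index == video_stream_index && (dePkt->flags & AV_PKT_FLAG_KEY)) {
			double timeStamp = dePkt->pts * av_q2d(ifmt_ctx->streams[video_stream_index]->time_base);
			if (timeStamp < (double)time_interval * key_pkt_count) {
				av_packet_unref(dePkt);   // release the skipped packet before reading the next one
				continue;
			}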
			key_pkt_count++;

			fprintf(stdout, "Send a key packet to decoder. timeStamp: %.2f s.\n", timeStamp);

			// Decode this packet
			ret = avcodec_send_packet(deCodecCtx, dePkt);
			if (ret < 0) {
				fprintf(stderr, "Error sending a packet for decoding\n");
				goto end;
			}
			// In principle, for video, one packet should correspond to exactly one frame
			while (picture_count < picture_num && ret >= 0) {
				ret = avcodec_receive_frame(deCodecCtx, deFrame);
				if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) {
					ret = 1;
					break;
				} else if (ret < 0) {
					fprintf(stderr, "Error during decoding\n");
					goto end;
				}
				if (deFrame->pict_type == AV_PICTURE_TYPE_I) { // confirm it is a keyframe; in principle it always is
					constuctFileName();

					ret = Frame2JPG(nameBuffer);
					if (ret < 0) {
						fprintf(stderr, "Transcode to jpeg failed.\n");
						goto end;
					}
					picture_count++;
				} else {
					fprintf(stdout, "Warning: Not key frame.\n");
				}
			}
		}
		av_packet_unref(dePkt);
	}

end:

	if (ret < 0) {
		fprintf(stderr, "Error occurred: %s\n", av_err2str(ret));
	} else {
		fprintf(stdout, "SUCCESS\n");
	}

	printf("--------------------------------------\n");
	printf("pkt_count = %d, key_pkt_count = %d, picture_count = %d\n", pkt_count, key_pkt_count, picture_count);
	printf("--------------------------------------\n");

	// Release resources
	closeDecodeParam();
	closeEncodeParam();

	return ret < 0 ? 1 : 0;   // non-zero exit code when an error occurred
}
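
The comments in the listing say that the output size, the number of screenshots and the interval could come from input arguments, but main() never reads them. A minimal sketch of how that might look follows. The argument layout (`prog input prefix [width height [count [interval]]]`) and the names `SnapshotOptions`, `parse_positive_int` and `parse_snapshot_options` are assumptions made for illustration; they are not part of the original program.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical option set; 0 for width/height means "use the input size". */
typedef struct SnapshotOptions {
    int out_width;
    int out_height;
    int picture_num;    /* total number of screenshots */
    int time_interval;  /* seconds between screenshots */
} SnapshotOptions;

/* Parse a strictly positive decimal int; returns 0 on success, -1 on error. */
int parse_positive_int(const char *text, int *out) {
    char *end = NULL;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0' || value <= 0 || value > INT_MAX) {
        return -1;
    }
    *out = (int)value;
    return 0;
}

/* Assumed layout: prog input prefix [width height [count [interval]]] */
int parse_snapshot_options(int argc, char **argv, SnapshotOptions *opt) {
    assert(argv != NULL && opt != NULL);

    /* Defaults mirror the globals in the listing above. */
    opt->out_width = 0;
    opt->out_height = 0;
    opt->picture_num = 1000;
    opt->time_interval = 1;

    /* argc == 4 would mean a width without a height. */
    if (argc < 3 || argc == 4 || argc > 7) {
        return -1;
    }
    if (argc >= 5 && (parse_positive_int(argv[3], &opt->out_width) != 0 ||
                      parse_positive_int(argv[4], &opt->out_height) != 0)) {
        return -1;
    }
    if (argc >= 6 && parse_positive_int(argv[5], &opt->picture_num) != 0) {
        return -1;
    }
    if (argc >= 7 && parse_positive_int(argv[6], &opt->time_interval) != 0) {
        return -1;
    }
    return 0;
}
```

In main(), the parsed values would replace the hard-coded globals, with outWidth and outHeight falling back to inWidth and inHeight when the parsed size is 0.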

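For continuous online screenshots, CPU usage should be kept as low as possible, so the code above reuses both the decoder and the encoder and only opens a new output file for each picture.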

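There is a problem, though. When calling avcodec_send_packet(), you will find that three packets have to be sent before the first frame comes out. This is caused by the decoder's internal buffering, and it means:

The first screenshot only arrives after 2 intervals, and every screenshot is delayed by 2 intervals.
The last 2 screenshots can never be retrieved.

Ideally the decoder would decode each packet immediately, but that does not seem to be possible. Below is the comment on avcodec_send_packet() in the source code: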

* @param[in] avpkt The input AVPacket. Usually, this will be a single video
*                  frame, or several complete audio frames.
*                  Ownership of the packet remains with the caller, and the
*                  decoder will not write to the packet. The decoder may create
*                  a reference to the packet data (or copy it if the packet is
*                  not reference-counted).
*                  Unlike with older APIs, the packet is always fully consumed,
*                  and if it contains multiple frames (e.g. some audio codecs),
*                  will require you to call avcodec_receive_frame() multiple
*                  times afterwards before you can send a new packet.
*                  It can be NULL (or an AVPacket with data set to NULL and
*                  size set to 0); in this case, it is considered a flush
*                  packet, which signals the end of the stream. Sending the
*                  first flush packet will return success. Subsequent ones are
*                  unnecessary and will return AVERROR_EOF. If the decoder
*                  still has frames buffered, it will return them after sending
*                  a flush packet.

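So, for screenshots to be produced without delay, you need to:

Send a NULL AVPacket after sending each packet to the decoder.
Not reuse the decoder: create a new one each time. This should not cost much CPU either.

With these problems solved, you can happily take screenshots at fixed time intervals.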

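Update 2020-05-07

Today I tested memory usage. First of all, this code has no memory leak: I ran it for several hours and memory usage did not grow.

The tests below used a 1 Mbps stream (or an FLV file) and measured memory usage at several stages:

Only the program is loaded, i.e. it blocks on the first line of code.
The decoder and encoder are initialized from the input stream (or file) information, then the program blocks.
Normal operation: pulling the stream and taking screenshots.

Approximate results (Windows 10):

Roughly speaking, memory usage is low before encoding and decoding actually start running. (I did not analyze this any further.)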

References