FFmpeg4.0+SDL2.0笔记01：Making Screencaps

环境

背景：在系统性学习FFmpeg时，发现官方推荐教程还是15年的，不少接口已经弃用，大版本也升了一级，所以在这里记录下FFmpeg4.0 SDL2.0的学习过程。

win10,VS2019,FFmpeg4.3.2,SDL2.0.14

原文地址：http://dranger.com/ffmpeg/tutorial01.html

概述

多媒体文件有一些基本组成知识。

1、多媒体文件本身被称为容器，容器类型决定了文件内部的存储形式，比如AVI和Quicktime就是两种不同的容器。

2、多媒体文件中有多串stream（数据流），通常是一串视频流加一串音频流（stream可以理解为按时间轴获取的连续数据元素），流中的数据元素被称为frame（帧），比如常见的视频流由一串连续的H264数据帧组成。不同的流通过不通的codec（编解码器）编码生成。codec定义了真实数据是如何编码/解码的（CODEC==COded DECoded)，

3、从流中直接读取出来的叫packet（数据包），每个packet里包含一个或多个frame。

音视频处理流程可以概括为这几步：

代码语言：javascript复制

10 从 video.avi 中打开 videoStream
20 从 videoStream 中读取 packet 送给 ffmpeg 解码 
30 从 ffmpeg 获取拿解码完的 frame，如果frame还不完整，GOTO 20
40 用户根据自己的需求处理 frame，比如保存成文件，或者渲染成视频
50 GOTO 20

在这一章，我们将打开一个多媒体文件，读取里面的视频流，使用ffmpeg解码，然后把解码后的frame转换成RGB格式，最后保存到ppm文件里。

打开文件

在以前的版本，我们需要先使用av_register_all初始化ffmpeg，但4.0已弃用该接口，直接avformat_open_input打开文件即可。

代码语言：javascript复制

extern "C" {
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>
#include <libswscale/swscale.h>
#include <libavutil/imgutils.h>
}

const string kMediaFile = "../../../assets/destiny.mp4";

int main(int argc, charg *argv[]) {
	//初始化不用啦，直接avformat_open_input即可
	//av_register_all();
	
	AVFormatContext* pFormatCtx = nullptr;
	//读取avformat
	if (avformat_open_input(&pFormatCtx, kMediaFile.c_str(), nullptr, nullptr) < 0) {
		cout << "avformat_open_input failed" << endl;
		return -1;
	}
}

avformat_open_input 读取文件头并把文件格式等信息保存在AVFormatContext中，后面两个参数分别为AVInputFormat指定视频格式，AVDictionary指定码流的各种参数，传空的话FFmpeg会自动检测。

接下来读取数据流信息：

代码语言：javascript复制

    //读取流信息，该函数会填充pFormatCtx->streams
    if (avformat_find_stream_info(pFormatCtx, nullptr) < 0) {
        cout << "avformat_find_stream_info failed" << endl;
        return -1;
    }
    //dump格式信息
    av_dump_format(pFormatCtx, 0, kMediaFile.c_str(), 0);

avformat_find_stream_info读取流信息并保存到pFormatCtx->streams，可通过av_dump_format 查看具体内容。

pFormatCtx->streams是一个指针数组，数组里的指针分别指向不同的流信息，长度为pFormatCtx->nb_streams，我们遍历来找到视频流的相关参数。

代码语言：javascript复制

    int iVideoStream = -1;
    AVCodecParameters* pVideoCodecPar = nullptr;
    //找到视频流编码信息
    for (unsigned i = 0; i < pFormatCtx->nb_streams;   i) {
        if (iVideoStream == -1 && pFormatCtx->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) {
            iVideoStream = i;
            break;
        }
    }
    if (iVideoStream == -1) {
        cout << "couldn't find video stream" << endl;
        return -1;
    }
    //AVStream.codec 被替换为 AVStream.codecpar
    //详见https://lists.libav.org/pipermail/libav-commits/2016-February/018031.html
    pVideoCodecPar = pFormatCtx->streams[iVideoStream]->codecpar;

AVCodecParameters包含视频流使用的codec的相关参数（原先的AVStream.codec已被弃用）。有了我们就能找到具体的解码器并打开它。

代码语言：javascript复制

        
	AVCodec* pVideoCodec = nullptr;
	AVCodecContext* pVideoCodecCtx = nullptr;
        //找到对应的decoder
        pVideoCodec = avcodec_find_decoder(pVideoCodecPar->codec_id);
        if (pVideoCodec == nullptr) {
            cout << "avcodec_find_decoder failed" << endl;
            return -1;
        }

        //avcodec_open2前必须使用avcodec_alloc_context3生成context
        pVideoCodecCtx = avcodec_alloc_context3(pVideoCodec);
        if (avcodec_parameters_to_context(pVideoCodecCtx, pVideoCodecPar) < 0) {
            cout << "avcodec_parameters_to_context failed" << endl;
            return -1;
        }

        //使用pVideoCodec初始化pVideoCodecCtx
        if (avcodec_open2(pVideoCodecCtx, pVideoCodec, nullptr) < 0) {
            cout << "avcodec_open2 failed" << endl;
            return -1;
        }

最后初始化AVCodecContext，即后续解码需要的上下文。

缓存数据

解码之前，我们需要一个载体来保存解码后的数据，在ffmpeg里，这个载体叫AVFrame，它必须由av_frame_alloc初始化。

代码语言：javascript复制

    AVFrame* pFrame = nullptr;
    pFrame=av_frame_alloc();

拿到解码数据后，我们需要把它从YUV格式转换为RGB格式，转换需要用到<swscale>。相应的，我们要初始化转换上下文sws_context，以及分配保存RGB数据的buffer。

代码语言：javascript复制

        SwsContext* pSwsCtx = nullptr;
        uint8_t* data[4] = { nullptr };
        int linesizes[4] = { 0 };
        
        //申请存储RGB的buffer
        //原先avpicture_get_size=>av_malloc=>avpicture_fill这套接口被弃用了
        //新接口 <libavutil/imgutils.h>::av_image_alloc
        if (av_image_alloc(data, linesizes, pVideoCodecCtx->width, pVideoCodecCtx->height, AV_PIX_FMT_RGB24, 16) < 0) {
            cout << "av_image_alloc failed" << endl;
            return -1;
        }
        
        //YUV2RGB前需要初始化sws_context
        pSwsCtx = sws_getContext(pVideoCodecCtx->width, pVideoCodecCtx->height, pVideoCodecCtx->pix_fmt,
            pVideoCodecCtx->width, pVideoCodecCtx->height, AV_PIX_FMT_RGB24,
            SWS_BILINEAR, nullptr, nullptr, nullptr);

读取数据

下面是读取packet，解码，转换，保存的代码：

代码语言：javascript复制

    int count = 0;
    //读取视频包并解码，转换，保存文件
    while (av_read_frame(pFormatCtx, &packet) >= 0) {
        if (packet.stream_index == iVideoStream) {
            avcodec_send_packet(pVideoCodecCtx, &packet);
            if (avcodec_receive_frame(pVideoCodecCtx, pFrame) == 0) {

                //原先使用了pFrameRGB这个结构体作为data和linesizes的载体，这里弃用了
                sws_scale(pSwsCtx,
                    pFrame->data, pFrame->linesize, 0, pFrame->height,
                    data, linesizes);
                if (count   % 100 == 0) {
                    saveFrame(data, linesizes, pVideoCodecCtx->width, pVideoCodecCtx->height, count);
                    cout << "save frame" << count << endl;
                }
            }
            //取消packet引用的内存，原先的av_free_packet弃用
            av_packet_unref(&packet);
        }
    }

整个流程很简单：

av_read_frame 读取packet。注意packet使用完后需要我们调用av_packet_unref来取消packet的数据引用。

avcodec_send_packet将packet送给ffmpeg解码。

avcodec_receive_frame从ffmpeg拿解码后的数据。不过不是每一个packet送过去都能拿到解码后的数据，需要根据返回值判断。

sws_scale将YUV转换为RGB。

saveFrame是我们自己实现的接口，它会将RGB保存为PPM文件。

代码语言：javascript复制

const string kBaseDir = "../../../assets/";
void saveFrame(uint8_t** data, int* linesizes, int width, int height, int count) {
    ofstream ofs;
    string fileName;

    fileName = kBaseDir   to_string(count)   ".ppm";
    ofs.open(fileName, ios::binary);
    if (!ofs.is_open()) {
        cout << "can't open file " << fileName << endl;
        return;
    }

    //ppm header
    string header = "P6n"   to_string(width)   " "   to_string(height)   "n255n";
    ofs.write(header.c_str(), header.size());
    //wiite rgb data by line
    for (int i = 0; i < height;   i) {
        //对于RGB格式，linesize[0]即一行的字节数,有linesizes[0]==3*width
        ofs.write((const char*)data[0]   i * linesizes[0], linesizes[0]);
    }

    ofs.close();
}

完成！最后别忘了释放内存。

代码语言：javascript复制

    av_freep(data);
    av_free(pFrame);

    avcodec_close(pVideoCodecCtx);

    avformat_close_input(&pFormatCtx);

以上就是第一章的全部内容。

代码：https://github.com/onlyandonly/ffmpeg_sdl_player

视频处理 python 编程算法容器

0 人点赞