How to Use yaraQA to Improve the Quality and Performance of YARA Rules

2023-09-18 19:53:04

About yaraQA

yaraQA is a YARA rule analyzer that helps researchers improve the quality and performance of their YARA rules.

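Many YARA rules are syntactically correct yet still functionally flawed. yaraQA tries to find such problems and report them to the developers or maintainers of the YARA rule set.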

What yaraQA Checks

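yaraQA tries to detect the following issues:

1. Rules that are syntactically correct but never match because of an error in the condition;
2. Rules that use a likely incorrect combination of string and modifier (e.g. $ = "\\Debug\\" fullword);
3. Performance problems caused by short atoms, repeated characters, or loops (e.g. $ = "AA"; these checks can be excluded from the analysis with --ignore-performance).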
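To see why the second kind of issue matters, recall that YARA's fullword modifier only accepts a match that is delimited by non-alphanumeric characters. The following is a minimal Python sketch of that idea, not YARA's actual implementation; the function name fullword_match and the PDB path are hypothetical and chosen only for illustration.

```python
def fullword_match(data: bytes, needle: bytes) -> bool:
    """Simplified model of 'fullword': a hit only counts when the bytes
    directly before and after it are not ASCII alphanumeric."""
    pos = data.find(needle)
    while pos != -1:
        end = pos + len(needle)
        before_ok = pos == 0 or not data[pos - 1:pos].isalnum()
        after_ok = end == len(data) or not data[end:end + 1].isalnum()
        if before_ok and after_ok:
            return True
        pos = data.find(needle, pos + 1)
    return False


# Hypothetical PDB path as it might appear inside a binary.
pdb_path = b"C:\\Projects\\Debug\\app.pdb"
needle = b"\\Debug\\"  # the bytes \Debug\

print(needle in pdb_path)                # True: plain substring search finds it
print(fullword_match(pdb_path, needle))  # False: neighbours 's' and 'a' are alphanumeric
```

In a real path, a segment such as \Debug\ is almost always surrounded by letters or digits, so under this model the fullword variant would not match even though the string is present.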

Installation

The tool is written in Python 3, so first install and configure a Python 3 environment on the local machine. Then clone the project source with the following command:

git clone https://github.com/Neo23x0/yaraQA.git

Next, change into the project directory and use pip with the provided requirements.txt file to install the remaining dependencies:

cd yaraQA/

pip install -r requirements.txt

Usage

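usage: yaraQA.py [-h] [-f yara files [yara files ...]] [-d yara files [yara files ...]] [-o outfile] [-b baseline] [-l level]
                 [--ignore-performance] [--debug]

YARA RULE ANALYZER

optional arguments:
  -h, --help            show this help message and exit
  -f yara files [yara files ...]
                        Path to input files (one or more YARA rule files, separated by spaces)
  -d yara files [yara files ...]
                        Path to input directories (YARA rule directories, separated by spaces)
  -o outfile            Output file for the analysis results (JSON, default: 'yaraQA-issues.json')
  -b baseline           Use an issue baseline to filter issues out of the results
  -l level              Minimum level to show (1=informational, 2=warning, 3=critical)
  --ignore-performance  Suppress performance-related rule issues
  --debug               Debug output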

Usage Examples

python3 yaraQA.py -d ./test/

Suppress all performance-related issues and show only logic issues:

python3 yaraQA.py -d ./test/ --ignore-performance

Suppress all issues of an informational character:

python3 yaraQA.py -d ./test/ -l 2

Use a baseline to show only new issues. The baseline file must be a .json file:

python3 yaraQA.py -d ./test/ -b yaraQA-reviewed-issues.json
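When these invocations are scripted, for example in a CI job, it can help to assemble the command line in one place. The sketch below is only an illustration that uses the flags listed in the help text; the helper names build_yaraqa_command and run_yaraqa are hypothetical, and the script is assumed to be run from inside the cloned yaraQA directory.

```python
import subprocess
from typing import List, Optional


def build_yaraqa_command(
    rules_dir: str,
    min_level: int = 1,
    baseline: Optional[str] = None,
    ignore_performance: bool = False,
    outfile: Optional[str] = None,
) -> List[str]:
    """Assemble a yaraQA.py command line from the flags shown in the help text."""
    cmd = ["python3", "yaraQA.py", "-d", rules_dir, "-l", str(min_level)]
    if baseline:
        cmd += ["-b", baseline]
    if outfile:
        cmd += ["-o", outfile]
    if ignore_performance:
        cmd.append("--ignore-performance")
    return cmd


def run_yaraqa(cmd: List[str]) -> int:
    """Run the assembled command and return its exit code (not called here)."""
    return subprocess.run(cmd, check=False).returncode


cmd = build_yaraqa_command(
    "./test/", min_level=2, baseline="yaraQA-reviewed-issues.json"
)
print(" ".join(cmd))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when rule paths contain spaces.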

Output

By default, yaraQA writes the issues it detects to a file named yaraQA-issues.json; a different path can be set with -o.
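Because the results are plain JSON, they are easy to post-process. The following is a minimal sketch that counts issues per level and type, assuming each entry carries the rule, id, level and type keys seen in the sample output reproduced in this article. The function names and the abbreviated sample entries are illustrative only.

```python
import json
from collections import Counter
from typing import Dict, List


def load_issues(path: str = "yaraQA-issues.json") -> List[Dict]:
    """Read the issue list written by yaraQA (default file name from the help text)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def summarize_issues(issues: List[Dict]) -> Counter:
    """Count issues per (level, type) pair; missing keys are counted as '?'."""
    return Counter((i.get("level", "?"), i.get("type", "?")) for i in issues)


# Two abbreviated entries in the shape of the sample output (issue text omitted).
sample = json.loads("""
[
  {"rule": "Demo_Rule_4_Condition_Never_Matches", "id": "CE1",
   "level": "error", "type": "logic"},
  {"rule": "Demo_Rule_2_Short_Atom", "id": "PA2",
   "level": "warning", "type": "performance"}
]
""")

for (level, kind), count in sorted(summarize_issues(sample).items()):
    print(f"{level:8} {kind:12} {count}")
```

In practice, the sample list would be replaced by load_issues() pointing at the file produced by a real run.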

The following is an example of the JSON results produced by yaraQA:

[
    {
        "rule": "Demo_Rule_1_Fullword_PDB",
        "id": "SM1",
        "issue": "The rule uses a PDB string with the modifier 'wide'. PDB strings are always included as ASCII strings. The 'wide' keyword is unneeded.",
        "element": {
            "name": "$s1",
            "value": "\\i386\\mimidrv.pdb",
            "type": "text",
            "modifiers": [
                "ascii",
                "wide",
                "fullword"
            ]
        },
        "level": "info",
        "type": "logic",
        "recommendation": "Remove the 'wide' modifier"
    },
    {
        "rule": "Demo_Rule_1_Fullword_PDB",
        "id": "SM2",
        "issue": "The rule uses a PDB string with the modifier 'fullword' but it starts with two backslashes and thus the modifier could lead to a dysfunctional rule.",
        "element": {
            "name": "$s1",
            "value": "\\i386\\mimidrv.pdb",
            "type": "text",
            "modifiers": [
                "ascii",
                "wide",
                "fullword"
            ]
        },
        "level": "warning",
        "type": "logic",
        "recommendation": "Remove the 'fullword' modifier"
    },
    {
        "rule": "Demo_Rule_2_Short_Atom",
        "id": "PA2",
        "issue": "The rule contains a string that turns out to be a very short atom, which could cause a reduced performance of the complete rule set or increased memory usage.",
        "element": {
            "name": "$s1",
            "value": "{ 01 02 03 }",
            "type": "byte"
        },
        "level": "warning",
        "type": "performance",
        "recommendation": "Try to avoid using such short atoms, by e.g. adding a few more bytes to the beginning or the end (e.g. add a binary 0 in front or a space after the string). Every additional byte helps."
    },
    {
        "rule": "Demo_Rule_3_Fullword_FilePath_Section",
        "id": "SM3",
        "issue": "The rule uses a string with the modifier 'fullword' but it starts and ends with two backslashes and thus the modifier could lead to a dysfunctional rule.",
        "element": {
            "name": "$s1",
            "value": "\\ZombieBoy\\",
            "type": "text",
            "modifiers": [
                "ascii",
                "fullword"
            ]
        },
        "level": "warning",
        "type": "logic",
        "recommendation": "Remove the 'fullword' modifier"
    },
    {
        "rule": "Demo_Rule_4_Condition_Never_Matches",
        "id": "CE1",
        "issue": "The rule uses a condition that will never match",
        "element": {
            "condition_segment": "2 of",
            "num_of_strings": 1
        },
        "level": "error",
        "type": "logic",
        "recommendation": "Fix the condition"
    },
    {
        "rule": "Demo_Rule_5_Condition_Short_String_At_Pos",
        "id": "PA1",
        "issue": "This rule looks for a short string at a particular position. A short string represents a short atom and could be rewritten to an expression using uint(x) at position.",
        "element": {
            "condition_segment": "$mz at 0",
            "string": "$mz",
            "value": "MZ"
        },
        "level": "warning",
        "type": "performance",
        "recommendation": ""
    },
    {
        "rule": "Demo_Rule_5_Condition_Short_String_At_Pos",
        "id": "PA2",
        "issue": "The rule contains a string that turns out to be a very short atom, which could cause a reduced performance of the complete rule set or increased memory usage.",
        "element": {
            "name": "$mz",
            "value": "MZ",
            "type": "text",
            "modifiers": [
                "ascii"
            ]
        },
        "level": "warning",
        "type": "performance",
        "recommendation": "Try to avoid using such short atoms, by e.g. adding a few more bytes to the beginning or the end (e.g. add a binary 0 in front or a space after the string). Every additional byte helps."
    },
    {
        "rule": "Demo_Rule_6_Condition_Short_Byte_At_Pos",
        "id": "PA1",
        "issue": "This rule looks for a short string at a particular position. A short string represents a short atom and could be rewritten to an expression using uint(x) at position.",
        "element": {
            "condition_segment": "$mz at 0",
            "string": "$mz",
            "value": "{ 4d 5a }"
        },
        "level": "warning",
        "type": "performance",
        "recommendation": ""
    },
    {
        "rule": "Demo_Rule_6_Condition_Short_Byte_At_Pos",
        "id": "PA2",
        "issue": "The rule contains a string that turns out to be a very short atom, which could cause a reduced performance of the complete rule set or increased memory usage.",
        "element": {
            "name": "$mz",
            "value": "{ 4d 5a }",
            "type": "byte"
        },
        "level": "warning",
        "type": "performance",
        "recommendation": "Try to avoid using such short atoms, by e.g. adding a few more bytes to the beginning or the end (e.g. add a binary 0 in front or a space after the string). Every additional byte helps."
    },
    {
        "rule": "Demo_Rule_6_Condition_Short_Byte_At_Pos",
        "id": "SM3",
        "issue": "The rule uses a string with the modifier 'fullword' but it starts and ends with two backslashes and thus the modifier could lead to a dysfunctional rule.",
        "element": {
            "name": "$s1",
            "value": "\\Section\\in\\Path\\",
            "type": "text",
            "modifiers": [
                "ascii",
                "fullword"
            ]
        },
        "level": "warning",
        "type": "logic",
        "recommendation": "Remove the 'fullword' modifier"
    }
]
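The PA1 entries suggest rewriting a check such as "$mz at 0" as a uint(x) expression, although their recommendation field is empty. In YARA this is commonly written as uint16(0) == 0x5A4D, because uint16 reads a little-endian 16-bit integer at the given offset. The short Python sketch below only illustrates where that constant comes from; the helper name le_uint16 and the sample header bytes are hypothetical.

```python
import struct


def le_uint16(data: bytes, offset: int = 0) -> int:
    """Read a little-endian unsigned 16-bit integer at the given offset."""
    return struct.unpack_from("<H", data, offset)[0]


# Illustrative first bytes of a file starting with the "MZ" magic.
header = b"MZ\x90\x00"

value = le_uint16(header, 0)
print(hex(value))       # 0x5a4d: 'M' is 0x4D, 'Z' is 0x5A, stored low byte first
print(value == 0x5A4D)  # True
```

The byte order explains why the constant reads "ZM" when written in hex: the first byte in the file becomes the low-order byte of the integer.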

Sample Rules with Issues

The project ships sample rules that contain such issues; they can be found in the ./test directory.

License

This project is developed and released under the GPL-3.0 open-source license.

Project URL

yaraQA: https://github.com/Neo23x0/yaraQA
