Linux Performance Tuning: Configuring Multi-Scenario Optimization Profiles with Tuned

2023-11-27 12:56:04

1 Preface


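  • Notes compiled while preparing for an exam
  • This post gives a basic introduction to tuned, the Linux tuning tool
  • It briefly explains the tuning configuration files and shows how to define a custom profile
  • If I have misunderstood anything, corrections are welcome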

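For each person there is only one true calling: to find oneself, and then to hold fast to it for a lifetime, wholeheartedly and without ceasing. All other paths are incomplete; they are a means of escape, a cowardly return to the ideals of the masses, a drifting with the current, a fear of one's own inner self. (Hermann Hesse, "Demian")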


2 tuned Performance Tuning Configuration

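tuned is a system performance tuning tool developed and maintained by Red Hat. It dynamically adjusts system parameters and settings to match the workload and requirements, aiming for the best performance and efficiency.

The official site describes it as follows:

  • Monitors connected devices using the udev device manager
  • Tunes system settings according to the selected profile
  • Supports various types of configuration, such as sysctl, sysfs, or kernel boot command-line parameters, which are integrated in a plug-in architecture
  • Supports hot plugging of devices and can be controlled from the command line or through D-Bus, so it integrates easily into existing administration solutions, for example Cockpit
  • Can be run in no-daemon mode, with limited functionality
  • Stores all its configuration cleanly in one place, the TuneD profile, instead of spreading it across multiple places and custom scripts

tuned profiles let you optimize a system for different use cases; the profile is the core concept in tuned.

Install the tool and enable it at boot: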

┌──[root@liruilongs.github.io]-[/proc/sys/fs]
└─$yum -y install  tuned
Updating Subscription Management repositories.
Unable to read consumer identity
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Last metadata expiration check: 1:07:01 ago on Sun 01 Oct 2023 03:03:42 PM CST.
Package tuned-2.13.0-6.el8.noarch is already installed.
Dependencies resolved.
Nothing to do.
Complete!
┌──[root@liruilongs.github.io]-[/etc/tuned]
└─$systemctl enable tuned --now

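tuned ships with many tuning profiles. Each one adjusts kernel parameters toward a different goal, such as power saving, high network throughput, or low latency. Administrators can also define custom profiles to suit their own needs.

Activate a specific profile: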

tuned-adm profile <profile_name>
┌──[root@liruilongs.github.io]-[/etc/tuned]
└─$tuned-adm profile throughput-performance

View the active profile:

┌──[root@liruilongs.github.io]-[/etc/tuned]
└─$tuned-adm active
Current active profile: virtual-guest

The /etc/tuned/ directory contains the active_profile file, which records the currently active profile.
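Because the active profile is recorded in a plain file, it can be read without going through tuned-adm. The following is a minimal sketch, assuming the file holds the profile name on its first line; the active_profile function name and the optional path argument are illustrative, not part of tuned.

```shell
#!/bin/sh
# Print the active tuned profile by reading the state file directly.
# $1 optionally overrides the path (handy for dry runs on a copy).
active_profile() {
    file="${1:-/etc/tuned/active_profile}"
    if [ -s "$file" ]; then
        head -n 1 "$file"      # first line holds the profile name
    else
        echo "none"            # file missing or empty: no profile recorded
    fi
}

active_profile
```

On a host without tuned, or after tuning has been switched off, the function falls back to printing none instead of failing.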

tuned-adm list prints the available profiles along with the currently active one.

As the output shows, the active profile here is virtual-guest (Current active profile: virtual-guest):

┌──[root@liruilongs.github.io]-[/proc/sys/fs]
└─$tuned-adm  list
Available profiles:
- accelerator-performance     - Throughput performance based tuning with disabled higher latency STOP states
- balanced                    - General non-specialized tuned profile
- desktop                     - Optimize for the desktop use-case
- hpc-compute                 - Optimize for HPC compute workloads
- intel-sst                   - Configure for Intel Speed Select Base Frequency
- latency-performance         - Optimize for deterministic performance at the cost of increased power consumption
- network-latency             - Optimize for deterministic performance at the cost of increased power consumption, focused on low latency network performance
- network-throughput          - Optimize for streaming network throughput, generally only necessary on older CPUs or 40G  networks
- powersave                   - Optimize for low power consumption
- throughput-performance      - Broadly applicable tuning that provides excellent performance across a variety of common server workloads
- virtual-guest               - Optimize for running inside a virtual guest
- virtual-host                - Optimize for running KVM guests
Current active profile: virtual-guest
┌──[root@liruilongs.github.io]-[/proc/sys/fs]
└─$

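Brief descriptions:

  • accelerator-performance: throughput-oriented tuning with the higher-latency STOP states disabled (accelerator mode)
  • balanced: the general, non-specialized tuned profile (balanced mode)
  • desktop: optimized for desktop use, giving interactive applications faster response
  • hpc-compute: optimized for HPC compute workloads
  • intel-sst: configures Intel Speed Select Base Frequency
  • latency-performance: deterministic performance at the cost of higher power consumption, suited to low-latency requirements
  • network-latency: like latency-performance, at the cost of higher power consumption, but focused on low-latency network performance
  • network-throughput: optimized for streaming network throughput; generally only necessary on older CPUs or 40G networks
  • powersave: optimized for low power consumption (power-saving mode)
  • throughput-performance: maximum throughput, raising disk and network I/O throughput
  • virtual-guest: optimized for running inside a virtual machine (guest mode)
  • virtual-host: optimized for running KVM guests, that is, for the host that runs the virtual machines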
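When scripting against the profile list, only the names are usually needed. A small sketch, assuming the output layout shown above, where each profile line starts with "- " followed by the name; profile_names is a hypothetical helper, not a tuned-adm feature.

```shell
#!/bin/sh
# Extract just the profile names from `tuned-adm list` output read on stdin.
profile_names() {
    awk '/^- / { print $2 }'
}

# Only query tuned-adm when it is actually installed.
if command -v tuned-adm >/dev/null 2>&1; then
    tuned-adm list | profile_names
fi
```

The header line and the trailing "Current active profile" line are skipped because neither begins with "- ".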

tuned-adm recommend shows the profile that tuned recommends for this system:

┌──[root@liruilongs.github.io]-[/etc/tuned]
└─$tuned-adm recommend
virtual-guest
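The recommend, active, and profile subcommands combine naturally into a small idempotent step. This is only a sketch: parse_active and apply_recommended are illustrative names, and the parsing assumes the "Current active profile: " prefix seen in the output above.

```shell
#!/bin/sh
# Strip the "Current active profile: " prefix from `tuned-adm active` output.
parse_active() {
    sed -n 's/^Current active profile: //p'
}

# Switch to the recommended profile only when it is not already active.
apply_recommended() {
    recommended=$(tuned-adm recommend) || return 1
    current=$(tuned-adm active | parse_active)
    if [ "$current" = "$recommended" ]; then
        echo "already on $recommended"
    else
        tuned-adm profile "$recommended" && echo "switched to $recommended"
    fi
}

# Run as root, and only on hosts where tuned-adm exists.
if command -v tuned-adm >/dev/null 2>&1; then
    apply_recommended
fi
```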

tuned-adm off turns tuning off:

┌──[root@liruilongs.github.io]-[/etc/tuned]
└─$tuned-adm off

Newer versions ship with more profiles and cover more tuning scenarios:

┌──[root@hp-ProLiant-SL270s-Gen8-SE]-[/sys/kernel/cgroup]
└─$ tuned-adm --version
tuned-adm 2.15.0
┌──[root@hp-ProLiant-SL270s-Gen8-SE]-[/sys/kernel/cgroup]
└─$ tuned-adm recommend
balanced

On this host the default (recommended) profile is balanced. The profiles it offers are listed below:

┌──[root@hp-ProLiant-SL270s-Gen8-SE]-[/sys/kernel/cgroup]
└─$ tuned-adm list
Available profiles:
- accelerator-performance     - Throughput performance based tuning with disabled higher latency STOP states
- atomic-guest                - Optimize virtual guests based on the Atomic variant
- atomic-host                 - Optimize bare metal systems running the Atomic variant
- balanced                    - General non-specialized tuned profile
- cpu-partitioning            - Optimize for CPU partitioning
- default                     - Legacy default tuned profile
- desktop                     - Optimize for the desktop use-case
- desktop-powersave           - Optmize for the desktop use-case with power saving
- enterprise-storage          - Legacy profile for RHEL6, for RHEL7, please use throughput-performance profile
- hpc-compute                 - Optimize for HPC compute workloads
- intel-sst                   - Configure for Intel Speed Select Base Frequency
- laptop-ac-powersave         - Optimize for laptop with power savings
- laptop-battery-powersave    - Optimize laptop profile with more aggressive power saving
- latency-performance         - Optimize for deterministic performance at the cost of increased power consumption
- mssql                       - Optimize for MS SQL Server
- network-latency             - Optimize for deterministic performance at the cost of increased power consumption, focused on low latency network performance
- network-throughput          - Optimize for streaming network throughput, generally only necessary on older CPUs or 40G  networks
- optimize-serial-console     - Optimize for serial console use.
- oracle                      - Optimize for Oracle RDBMS
- postgresql                  - Optimize for PostgreSQL server
- powersave                   - Optimize for low power consumption
- realtime                    - Optimize for realtime workloads
- realtime-virtual-guest      - Optimize for realtime workloads running within a KVM guest
- realtime-virtual-host       - Optimize for KVM guests running realtime workloads
- sap-hana                    - Optimize for SAP HANA
- sap-netweaver               - Optimize for SAP NetWeaver
- server-powersave            - Optimize for server power savings
- spectrumscale-ece           - Optimized for Spectrum Scale Erasure Code Edition Servers
- spindown-disk               - Optimize for power saving by spinning-down rotational disks
- throughput-performance      - Broadly applicable tuning that provides excellent performance across a variety of common server workloads
- virtual-guest               - Optimize for running inside a virtual guest
- virtual-host                - Optimize for running KVM guests
Current active profile: balanced

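tuned profiles are driven by configuration files: each profile has its own file, and reading it shows what the profile actually changes.

Anatomy of a tuning configuration file

Let's look at what a profile's configuration file does, using network-throughput as the example.

The network-throughput profile optimizes for streaming network throughput and is generally only necessary on older CPUs or 40G networks.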

tuned.conf is the configuration file of a tuned profile. It can be placed in either of these locations:

  • /etc/tuned/<profile-name>/tuned.conf
  • /usr/lib/tuned/<profile-name>/tuned.conf

The /etc/tuned/ directory takes precedence.
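The lookup order can be mimicked in a few lines to see which file would win for a given profile name. This sketch only reproduces the precedence described above using the two directories named in the text; resolve_profile and its optional directory arguments are illustrative, and tuned's own loader may consider more than this.

```shell
#!/bin/sh
# Print the tuned.conf that takes effect for a profile name.
# $1 = profile name, $2/$3 = optional overrides for the two search roots.
resolve_profile() {
    name="$1"
    etc_root="${2:-/etc/tuned}"
    lib_root="${3:-/usr/lib/tuned}"
    # The administrator's directory is checked first, so it shadows the packaged one.
    for root in "$etc_root" "$lib_root"; do
        if [ -f "$root/$name/tuned.conf" ]; then
            echo "$root/$name/tuned.conf"
            return 0
        fi
    done
    return 1
}

resolve_profile network-throughput || echo "profile not found on this host"
```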

The predefined profiles live in the directory below, and each profile has its own configuration file:

┌──[root@liruilongs.github.io]-[/usr/lib/tuned]
└─$ls
accelerator-performance  desktop    hpc-compute  latency-performance  network-throughput  recommend.d             virtual-guest
balanced                 functions  intel-sst    network-latency      powersave           throughput-performance  virtual-host
┌──[root@liruilongs.github.io]-[/usr/lib/tuned]
└─$

This predefined file has two sections, main and sysctl:

┌──[root@liruilongs.github.io]-[/usr/lib/tuned]
└─$cat ./network-throughput/tuned.conf
#
# tuned configuration
#

[main]
summary=Optimize for streaming network throughput, generally only necessary on older CPUs or 40G  networks
include=throughput-performance

[sysctl]
# Increase kernel buffer size maximums.  Currently this seems only necessary at 40Gb speeds.
#
# The buffer tuning values below do not account for any potential hugepage allocation.
# Ensure that you do not oversubscribe system memory.
net.ipv4.tcp_rmem="4096 87380 16777216"
net.ipv4.tcp_wmem="4096 16384 16777216"
net.ipv4.udp_mem="3145728 4194304 16777216"
┌──[root@liruilongs.github.io]-[/usr/lib/tuned]
└─$
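  • The main section describes what the profile is for and pulls in the throughput-performance profile via include=throughput-performance
  • The sysctl section lists the kernel parameters it changes; here it mainly raises the TCP and UDP network buffer sizes.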
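To make the buffer sizes concrete: tcp_rmem and tcp_wmem are triples of minimum, default, and maximum sizes in bytes. The sketch below converts the maximum to MiB and prints the live value for comparison; max_mib is a hypothetical helper written for this illustration.

```shell
#!/bin/sh
# Report the maximum of a "min default max" byte triple in MiB.
max_mib() {
    set -- $1                       # split the triple into $1 $2 $3
    echo $(( $3 / 1024 / 1024 ))
}

max_mib "4096 87380 16777216"       # tcp_rmem maximum from the profile, prints 16

# Show the live value for comparison when the kernel exposes it.
if [ -r /proc/sys/net/ipv4/tcp_rmem ]; then
    echo "live tcp_rmem: $(cat /proc/sys/net/ipv4/tcp_rmem)"
fi
```

The helper should not be applied to udp_mem, whose values are counted in memory pages rather than bytes.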

Now look at what the included profile tunes:

┌──[root@liruilongs.github.io]-[/usr/lib/tuned]
└─$cat ./throughput-performance/tuned.conf | grep -v ^#

[main]
summary=Broadly applicable tuning that provides excellent performance across a variety of common server workloads

[cpu]
governor=performance
energy_perf_bias=performance
min_perf_pct=100

[disk]
readahead=>4096

[sysctl]
kernel.sched_min_granularity_ns = 10000000
kernel.sched_wakeup_granularity_ns = 15000000
vm.dirty_ratio = 40
vm.dirty_background_ratio = 10
vm.swappiness=10
┌──[root@liruilongs.github.io]-[/usr/lib/tuned]
└─$

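This file adds further sections such as cpu and disk:

  • The [cpu] section holds CPU-related settings. governor=performance sets the CPU frequency governor to performance, energy_perf_bias=performance sets the energy/performance bias to performance, and min_perf_pct=100 sets the minimum performance percentage to 100%.
  • The [disk] section holds disk-related settings. readahead=>4096 sets the disk readahead to 4096 KiB; the > prefix means the value is only raised when the current readahead is lower.
  • The [sysctl] section sets kernel parameters. kernel.sched_min_granularity_ns sets the scheduler's minimum preemption granularity to 10,000,000 ns, and kernel.sched_wakeup_granularity_ns sets the wake-up granularity to 15,000,000 ns. It also sets memory-management parameters: vm.dirty_ratio, vm.dirty_background_ratio, and vm.swappiness.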
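One way to see whether these sysctl values are in effect is to read them back from /proc/sys. The following is a rough sketch using the values from the listing above; sysctl_path and check_key are illustrative helpers, and a key that the running kernel does not expose is simply reported as missing.

```shell
#!/bin/sh
# Map a sysctl key such as vm.dirty_ratio to its file under /proc/sys.
sysctl_path() {
    echo "/proc/sys/$(echo "$1" | tr . /)"
}

# Compare one live value with the value the profile asks for.
check_key() {
    key="$1"; want="$2"; path=$(sysctl_path "$key")
    if [ -r "$path" ]; then
        have=$(cat "$path")
        if [ "$have" = "$want" ]; then
            echo "ok       $key = $have"
        else
            echo "differs  $key = $have (profile wants $want)"
        fi
    else
        echo "missing  $key (not exposed by this kernel)"
    fi
}

check_key vm.dirty_ratio 40
check_key vm.dirty_background_ratio 10
check_key vm.swappiness 10
check_key kernel.sched_min_granularity_ns 10000000
```

The output depends on the host and on which profile is active, so a "differs" line is not necessarily an error.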

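The sections are covered in more detail below. Besides the per-profile configuration files above, tuned also has a global configuration file, which mainly holds settings for the tuned service itself: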

┌──[root@liruilongs.github.io]-[/usr/lib/tuned]
└─$cat /etc/tuned/tuned-main.conf  | grep -v ^#

daemon = 1 # whether to run in daemon mode
dynamic_tuning = 0 # whether dynamic tuning is enabled: 1 enables it, otherwise only static tuning is used
sleep_interval = 1
update_interval = 10
recommend_command = 1
reapply_sysctl = 1
default_instance_priority = 0
udev_buffer_size = 1MB
log_file_count = 2
log_file_max_size = 1MB
┌──[root@liruilongs.github.io]-[/usr/lib/tuned]
└─$
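A script can read individual settings from the global file, for example to find out whether dynamic tuning is switched on. This is a minimal sketch that assumes the simple "key = value" layout shown above; conf_get is a hypothetical helper and does not handle every syntax the real parser may accept.

```shell
#!/bin/sh
# Print the value of one "key = value" setting, ignoring trailing "# ..." comments.
# $1 = key, $2 = optional file path (defaults to the global config).
conf_get() {
    key="$1"; file="${2:-/etc/tuned/tuned-main.conf}"
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*\([^#]*\).*/\1/p" "$file" |
        sed 's/[[:space:]]*$//' | head -n 1
}

if [ -r /etc/tuned/tuned-main.conf ]; then
    if [ "$(conf_get dynamic_tuning)" = "1" ]; then
        echo "dynamic tuning is enabled"
    else
        echo "dynamic tuning is disabled (static tuning only)"
    fi
fi
```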

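Of course, we can also define our own configuration file.

Custom tuning profiles

You can define your own profiles. tuned tuning generally falls into two kinds, static and dynamic:

  • Static tuning: applies a predefined set of kernel parameters once (kernel parameters here mainly means tunables under /sys or /proc)
  • Dynamic tuning: tuned monitors system behavior and adjusts kernel parameters in response. For example, during a video transcoding job that needs no network access, the network interface speed can be lowered to save power; once transcoding finishes and network traffic stays elevated for a sustained period, the interface is set back to its maximum speed.

The /etc/tuned/ directory contains tuned-main.conf (the global configuration file), which mainly governs dynamic tuning. Custom profiles are usually placed in this directory as well.

A custom profile is usually created by copying an existing profile and modifying it: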

┌──[root@liruilongs.github.io]-[/usr/lib/tuned]
└─$cp -r /usr/lib/tuned/throughput-performance/ /etc/tuned/liruilong-performance
┌──[root@liruilongs.github.io]-[/usr/lib/tuned]
└─$cat /etc/tuned/liruilong-performance/tuned.conf
#
# tuned configuration
#

[main]
summary=Broadly applicable tuning that provides excellent performance across a variety of common server workloads

[cpu]
governor=performance
energy_perf_bias=performance
min_perf_pct=100

[disk]
# The default unit for readahead is KiB.  This can be adjusted to sectors
# by specifying the relevant suffix, eg. (readahead => 8192 s). There must
# be at least one space between the number and suffix (if suffix is specified).
readahead=>4096

[sysctl]
# ktune sysctl settings for rhel6 servers, maximizing i/o throughput
#
# Minimal preemption granularity for CPU-bound tasks:
# (default: 1 msec#  (1 + ilog(ncpus)), units: nanoseconds)
kernel.sched_min_granularity_ns = 10000000

# SCHED_OTHER wake-up granularity.
# (default: 1 msec#  (1 + ilog(ncpus)), units: nanoseconds)
#
# This option delays the preemption effects of decoupled workloads
# and reduces their over-scheduling. Synchronous workloads will still
# have immediate wakeup/sleep latencies.
kernel.sched_wakeup_granularity_ns = 15000000

# If a workload mostly uses anonymous memory and it hits this limit, the entire
# working set is buffered for I/O, and any more write buffering would require
# swapping, so it's time to throttle writes until I/O can catch up.  Workloads
# that mostly use file mappings may be able to use even higher values.
#
# The generator of dirty data starts writeback at this percentage (system default
# is 20%)
vm.dirty_ratio = 40

# Start background writeback (via writeback threads) at this percentage (system
# default is 10%)
vm.dirty_background_ratio = 10

# PID allocation wrap value.  When the kernel's next PID value
# reaches this value, it wraps back to a minimum PID value.
# PIDs of value pid_max or larger are not allocated.
#
# A suggested value for pid_max is 1024 * <# of cpu cores/threads in system>
# e.g., a box with 32 cpus, the default of 32768 is reasonable, for 64 cpus,
# 65536, for 4096 cpus, 4194304 (which is the upper limit possible).
#kernel.pid_max = 65536

# The swappiness parameter controls the tendency of the kernel to move
# processes out of physical memory and onto the swap disk.
# 0 tells the kernel to avoid swapping processes out of physical memory
# for as long as possible
# 100 tells the kernel to aggressively swap processes out of physical memory
# and move them to swap cache
vm.swappiness=10
┌──[root@liruilongs.github.io]-[/usr/lib/tuned]
└─$

Here is a brief explanation of the tuned.conf file; the man page has the full details:

┌──[root@liruilongs.github.io]-[/usr/lib/tuned]
└─$man tuned.conf

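main

[main] is the main section. It can contain an include= statement, which loads the named profile so that a new profile can be built on top of an existing one. If the two profiles define conflicting tuning parameters, the values in the new profile win.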
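As an alternative to copying a whole profile, include= allows a custom profile to carry only its overrides. The sketch below writes such a profile into a scratch directory; the profile name demo-performance, the summary text, and the vm.swappiness value of 5 are illustrative choices, not recommendations.

```shell
#!/bin/sh
# Write a minimal profile that inherits from throughput-performance.
# $1 = profile root directory, $2 = profile name (both chosen by the caller).
make_profile() {
    dir="$1/$2"
    mkdir -p "$dir" || return 1
    cat > "$dir/tuned.conf" <<'EOF'
[main]
summary=throughput-performance with a lower swappiness (illustrative)
include=throughput-performance

[sysctl]
vm.swappiness=5
EOF
}

# Dry run into a scratch directory so nothing under /etc is touched.
scratch=$(mktemp -d)
make_profile "$scratch" demo-performance
cat "$scratch/demo-performance/tuned.conf"
```

To use it for real, the same function could be pointed at /etc/tuned as root and the profile activated with tuned-adm profile demo-performance; the overriding vm.swappiness then replaces the included value of 10.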

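Plugins

All other sections define plugin instances. Each section has a name that uniquely identifies the instance; the name is arbitrary but must not be repeated. type= sets the plugin type. Common types include cpu, disk, mounts, net, vm, sysctl, script, sysfs, scheduler, and systemd.

If the section name [NAME] matches the plugin name, type can be omitted.

List the plugins that are currently supported: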

┌──[root@hp-ProLiant-SL270s-Gen8-SE]-[~]
└─$ dpkg -L tuned  | grep plugin_
/usr/lib/python3/dist-packages/tuned/plugins/plugin_audio.py
/usr/lib/python3/dist-packages/tuned/plugins/plugin_bootloader.py
/usr/lib/python3/dist-packages/tuned/plugins/plugin_cpu.py
/usr/lib/python3/dist-packages/tuned/plugins/plugin_disk.py
/usr/lib/python3/dist-packages/tuned/plugins/plugin_eeepc_she.py
/usr/lib/python3/dist-packages/tuned/plugins/plugin_irqbalance.py
/usr/lib/python3/dist-packages/tuned/plugins/plugin_modules.py
/usr/lib/python3/dist-packages/tuned/plugins/plugin_mounts.py
/usr/lib/python3/dist-packages/tuned/plugins/plugin_net.py
/usr/lib/python3/dist-packages/tuned/plugins/plugin_rtentsk.py
/usr/lib/python3/dist-packages/tuned/plugins/plugin_scheduler.py
/usr/lib/python3/dist-packages/tuned/plugins/plugin_script.py
/usr/lib/python3/dist-packages/tuned/plugins/plugin_scsi_host.py
/usr/lib/python3/dist-packages/tuned/plugins/plugin_selinux.py
/usr/lib/python3/dist-packages/tuned/plugins/plugin_service.py
/usr/lib/python3/dist-packages/tuned/plugins/plugin_sysctl.py
/usr/lib/python3/dist-packages/tuned/plugins/plugin_sysfs.py
/usr/lib/python3/dist-packages/tuned/plugins/plugin_systemd.py
/usr/lib/python3/dist-packages/tuned/plugins/plugin_usb.py
/usr/lib/python3/dist-packages/tuned/plugins/plugin_video.py
/usr/lib/python3/dist-packages/tuned/plugins/plugin_vm.py
/usr/lib/python3/dist-packages/tuned/utils/plugin_loader.py
┌──[root@hp-ProLiant-SL270s-Gen8-SE]-[~]
└─$

View the options a plugin supports:

┌──[root@hp-ProLiant-SL270s-Gen8-SE]-[~]
└─$ cat /usr/lib/python3/dist-packages/tuned/plugins/plugin_cpu.py | grep -A 10 config_options
        def _get_config_options(self):
                return {
                        "load_threshold"       : 0.2,
                        "latency_low"          : 100,
                        "latency_high"         : 1000,
                        "force_latency"        : None,
                        "governor"             : None,
                        "sampling_down_factor" : None,
                        "energy_perf_bias"     : None,
                        "min_perf_pct"         : None,
                        "max_perf_pct"         : None,
┌──[root@hp-ProLiant-SL270s-Gen8-SE]-[~]
└─$ cat /usr/lib/python3/dist-packages/tuned/plugins/plugin_disk.py | grep -A 10 config_options
        def _get_config_options(cls):
                return {
                        "dynamic"            : True, # FIXME: do we want this default?
                        "elevator"           : None,
                        "apm"                : None,
                        "spindown"           : None,
                        "readahead"          : None,
                        "readahead_multiply" : None,
                        "scheduler_quantum"  : None,
                }

--
        def _get_config_options_used_by_dynamic(cls):
                return [
                        "apm",
                        "spindown",
                ]

        def _instance_init(self, instance):
                instance._has_static_tuning = True

                self._apm_errcnt = 0
                self._spindown_errcnt = 0
┌──[root@hp-ProLiant-SL270s-Gen8-SE]-[~]
└─$

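The right values depend on your actual workload. The sysctl plugin sets kernel tunables directly, while the script plugin brings in custom tuning scripts, which need to be used together with the function library; script is more often used for dynamic tuning.

A demo: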

vim tuned.conf

[main]
# Includes plugins defined in "included" profile.
include=included
# Define my_sysctl plugin
[my_sysctl]
type=sysctl
# 256 KB default performs well experimentally.
net.core.rmem_default = 262144
net.core.wmem_default = 262144
[my_script]
type=script
script=${i:PROFILE_DIR}/profile.sh

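The script plugin executes the specified script. The shell script lives in the /etc/tuned/profile-name directory, and the script library /usr/lib/tuned/functions provides commonly used helper functions.

${i:PROFILE_DIR} uses the built-in PROFILE_DIR function, which returns the directory containing the profile's tuned.conf, so files in that directory can be referenced directly.

The script must handle the start and stop arguments. When the profile is activated, start applies the tuning settings; when the profile is deactivated, stop reverts everything that was applied earlier. See /usr/lib/tuned/powersave/script.sh for reference: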

┌──[root@liruilongs.github.io]-[/usr/lib/tuned]
└─$cat /usr/lib/tuned/powersave/script.sh
#!/bin/sh

. /usr/lib/tuned/functions

start() {
    [ "$USB_AUTOSUSPEND" = 1 ] && enable_usb_autosuspend
    enable_wifi_powersave
    return 0
}

stop() {
    [ "$USB_AUTOSUSPEND" = 1 ] && disable_usb_autosuspend
    disable_wifi_powersave
    return 0
}

process $@ # call the process function, passing along all arguments
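The start/stop contract can be tried out without the function library. The sketch below is a stand-in for a hypothetical profile.sh: the dispatch function only imitates the role that process plays in the real library, and the echo lines mark where actual tuning commands would go.

```shell
#!/bin/sh
# Hypothetical profile.sh skeleton; echo lines stand in for real tuning commands.
start() {
    echo "applying custom tuning"
    return 0
}

stop() {
    echo "reverting custom tuning"
    return 0
}

# Simplified stand-in for the library's process function: route on the first argument.
dispatch() {
    case "$1" in
        start) start ;;
        stop)  stop ;;
        *)     echo "usage: profile.sh {start|stop}" >&2; return 1 ;;
    esac
}

if [ "$#" -gt 0 ]; then
    dispatch "$@"
fi
```

In a real profile it is safer to source /usr/lib/tuned/functions and end with process $@, as the powersave script does, since the library version handles more than this simplified dispatcher.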

The functions called here can be found in the function library:

┌──[root@hp-ProLiant-SL270s-Gen8-SE]-[/usr/lib/python3/dist-packages/tuned/plugins]
└─$ cat /usr/lib/tuned/functions | grep process
# main processing
process() {
┌──[root@hp-ProLiant-SL270s-Gen8-SE]-[/usr/lib/python3/dist-packages/tuned/plugins]
└─$ cat /usr/lib/tuned/functions | grep enable
        # always patch cpuspeed configuration if exists, if it doesn't exist and is enabled,
# re-enable previous CPU governor settings
# enable multi core power savings for low wakeup systems
enable_cpu_multicore_powersave() {
THP_ENABLE="/sys/kernel/mm/transparent_hugepage/enabled"
[ -e "$THP_ENABLE" ] || THP_ENABLE="/sys/kernel/mm/redhat_transparent_hugepage/enabled"
enable_transparent_hugepages() {
        # 0    auto, PM enabled
enable_wifi_powersave() {
enable_bluetooth() {
enable_usb_autosuspend() {
enable_snd_ac97_powersave() {
enable_ksm()
┌──[root@hp-ProLiant-SL270s-Gen8-SE]-[/usr/lib/python3/dist-packages/tuned/plugins]
└─$

tuned-adm verify checks whether the profile's settings are actually in effect; for details, see the log at /var/log/tuned/tuned.log
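Verification and log inspection can be chained in one step. This sketch assumes the log contains the level names WARNING and ERROR surrounded by spaces, which is an assumption about the log layout rather than something shown above; log_problems is an illustrative helper.

```shell
#!/bin/sh
# Print warning and error lines from a tuned log file.
# $1 optionally overrides the log path.
log_problems() {
    grep -E ' (WARNING|ERROR) ' "${1:-/var/log/tuned/tuned.log}"
}

# Only run the check on hosts where tuned-adm exists.
if command -v tuned-adm >/dev/null 2>&1; then
    if tuned-adm verify; then
        echo "all settings match the active profile"
    else
        echo "verification failed, recent problems:"
        log_problems | tail -n 20
    fi
fi
```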

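3 References

© Copyright of the linked reference material belongs to the original authors; please let me know of any infringement. This is an open source project, so if you find it useful, a star is appreciated :)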


https://tuned-project.org/

"Red Hat Performance Tuning 442"

https://www.cnblogs.com/fengyouyishuang/articles/17232438.html
