C++ 中文周刊 2024-05-04 第156期

2024-07-30 15:11:08 浏览数 (2)

本期文章由 HNY 赞助

资讯

标准委员会动态/ide/编译器信息放在这里

编译器信息最新动态推荐关注hellogcc公众号 本周更新 2024-05-01 第252期

文章

Mastering C with Google Benchmark https://ashvardanian.com/posts/google-benchmark/

教了一些google测试使用案例,这里直接贴一下吧,还算常用

分段采集

代码语言:javascript复制
static void sorting(bm::State &state) {

    auto count = static_cast<size_t>(state.range(0));
    auto include_preprocessing = static_cast<bool>(state.range(1));

    std::vector<int32_t> array(count);
    std::iota(array.begin(), array.end(), 1);

    for (auto _ : state) {

        if (!include_preprocessing)
            state.PauseTiming();

        // Reverse order is the most classical worst case,
        // but not the only one.
        std::reverse(array.begin(), array.end());
        if (!include_preprocessing)
            state.ResumeTiming();

        std::sort(array.begin(), array.end());
        bm::DoNotOptimize(array.size());
    }
}

BENCHMARK(sorting)->Args({3, false})->Args({3, true});
BENCHMARK(sorting)->Args({4, false})->Args({4, true});

不同策略压测

代码语言:javascript复制
template <typename execution_policy_t>
static void super_sort(bm::State &state, execution_policy_t &&policy) {

    auto count = static_cast<size_t>(state.range(0));
    std::vector<int32_t> array(count);
    std::iota(array.begin(), array.end(), 1);

    for (auto _ : state) {
        std::reverse(policy, array.begin(), array.end());
        std::sort(policy, array.begin(), array.end());
        bm::DoNotOptimize(array.size());
    }

    state.SetComplexityN(count);
    state.SetItemsProcessed(count * state.iterations());
    state.SetBytesProcessed(count * state.iterations() * sizeof(int32_t));
}
BENCHMARK_CAPTURE(super_sort, seq, std::execution::seq)
    ->RangeMultiplier(8)
    ->Range(1l << 20, 1l << 32)
    ->MinTime(10)
    ->Complexity(bm::oNLogN);

BENCHMARK_CAPTURE(super_sort, par_unseq, std::execution::par_unseq)
    ->RangeMultiplier(8)
    ->Range(1l << 20, 1l << 32)
    ->MinTime(10)
    ->Complexity(bm::oNLogN);

perf

代码语言:javascript复制
$ sudo ./build_release/tutorial --benchmark_perf_counters="CYCLES,INSTRUCTIONS"
$ sudo perf stat taskset 0xEFFFEFFFEFFFEFFFEFFFEFFFEFFFEFFF ./build_release/tutorial --benchmark_filter=super_sort

 Performance counter stats for 'taskset 0xEFFFEFFFEFFFEFFFEFFFEFFFEFFFEFFF ./build_release/tutorial --benchmark_filter=super_sort':

       23048674.55 msec task-clock                #   35.901 CPUs utilized          
           6627669      context-switches          #    0.288 K/sec                  
             75843      cpu-migrations            #    0.003 K/sec                  
         119085703      page-faults               #    0.005 M/sec                  
    91429892293048      cycles                    #    3.967 GHz                      (83.33%)
    13895432483288      stalled-cycles-frontend   #   15.20% frontend cycles idle     (83.33%)
     3277370121317      stalled-cycles-backend    #    3.58% backend cycles idle      (83.33%)
    16689799241313      instructions              #    0.18  insn per cycle         
                                                  #    0.83  stalled cycles per insn  (83.33%)
     3413731599819      branches                  #  148.110 M/sec                    (83.33%)
       11861890556      branch-misses             #    0.35% of all branches          (83.34%)

     642.008618457 seconds time elapsed

   21779.611381000 seconds user
    1244.984080000 seconds sys
Messing with lifetime https://biowpn.github.io/bioweapon/2024/05/03/messing-with-lifetime.html
代码语言:javascript复制
struct Point { int x, y; };

void foo(unsigned char* buf, size_t len) {
    assert(len == sizeof(Point));
    Point* p = reinterpret_cast<Point*>(buf);
    if (p->x == 0) {
        // ...
    }
}

这段代码是UB,主要问题是buf不能保证活着,所以通常有一种转移大法

代码语言:javascript复制
void foo(unsigned char* buf, size_t len) {
    assert(len == sizeof(Point));
    Point point;                    // a Point is created
    std::memcpy(&point, buf, len);
    if (point.x == 0) {             // Ok
        // ...
    }
}

怎么避免这种拷贝?就比如说我作为写代码的人我确认这段buf绝对是活的,start_lifetime_as

代码语言:javascript复制
void foo(unsigned char* buf, size_t len) {
    assert(len == sizeof(Point));
    Point* p = std::start_lifetime_as<Point>(buf);
    if (p->x == 0) {
        // ...
    }
}

我们再来回顾一下new和malloc的区别

new分配了空间(1)然后**构造了对象(2)**,malloc没有构造

实际上从构造的那一刻开始 new还生成了一个活的对象,开始了一个对象的生命周期 start lifetime(3)

那对于placement new来说,就没有 (1)

对于这种buffer转POD对象的场景,就缺少 (3)

我们不知道他是不是活的,这也就是 start_lifetime_as

我们再回到start_lifetime_asreinterpret_cast

实际上reinterpret_cast就是过于强了

对于只读的这么玩只要作者能保证buffer没问题UB也就UB了对付用

但是reinterpret_cast不能保证对应的buffer是不是活的,而从c来的语法,malloc就直接用了,根本就没有UB这一说

代码语言:javascript复制
struct X { int a, b; };
X *make_x() {
    X *p = (X*)malloc(sizeof(struct X));
    p->a = 1;  // before P0593: UB, no X lives at p
    p->b = 2;  // before P0593: UB, no X lives at p
    return p;
}

按照start_lifetime_as的设计,得这么写

代码语言:javascript复制
X *p = std::start_lifetime_as<X>( malloc(sizeof(struct X)) );

说了这么一大堆,也没有什么最佳解法。知道这套东西傻逼就好了,只能修修补补

C 极致的性能压榨:assume属性 https://zhuanlan.zhihu.com/p/695459367

看一乐

Understand internals of std::expected https://www.cppstories.com/2024/expected-cpp23-internals/

expected的大小?https://godbolt.org/z/xx3eWcP8x

代码语言:javascript复制
#include <iostream>
#include <typeinfo>
#include <expected>
#include <cxxabi.h>

template <typename T, typename U>
void printSizes() {
    int status = 0;
    char *realnameT = abi::__cxa_demangle(typeid(T).name(), 0, 0, &status);
    char *realnameU = abi::__cxa_demangle(typeid(U).name(), 0, 0, &status);
    
    std::cout << "nType: " << (realnameT ? realnameT : typeid(T).name()) << 'n';
    std::cout << "Size: " << sizeof(T) << 'n';
    
    std::cout << "Type: " << (realnameU ? realnameU : typeid(U).name()) << 'n';
    std::cout << "Size: " << sizeof(U) << 'n';
    
    std::cout << "Sizeof std::expected: ";              
    std::cout << sizeof(std::expected<T, U>) << 'n';
    
    free(realnameT);
    free(realnameU);
}


int main() {
    printSizes<int, std::string>();
    printSizes<int, int>();
    printSizes<int, double>();
    printSizes<int, std::pair<int, int>>();
}

/*
Type: int
Size: 4
Type: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >
Size: 32
Sizeof std::expected: 40

Type: int
Size: 4
Type: int
Size: 4
Sizeof std::expected: 8

Type: int
Size: 4
Type: double
Size: 8
Sizeof std::expected: 16

Type: int
Size: 4
Type: std::pair<int, int>
Size: 8
Sizeof std::expected: 12
*/
Daily bit(e) of C | Optimizing code to run 87x faster https://simontoth.substack.com/p/daily-bite-of-c-optimizing-code-to

这个是之前的解析文件挑战,大概就是读到hashmap然后聚合一下

作者从map到hash到多线程优化,比用stl实现快很多倍。看个乐。感觉这就是面试题

File modes in C 20 https://euroquis.nl//blabla/2024/04/30/chmod.html

直接贴godbolt和代码吧 https://godbolt.org/z/xc5MjTrob

代码语言:javascript复制
#include <iostream>
#include <sys/stat.h>

namespace detail
{
    // Tag classes to get suggestive type names into error messages
    struct invalid_permission_character {};
    struct invalid_permission_string_length {};

    // Check a single character in the string, returning
    // the mode_t for that bit-position. Peculiar name means
    // we get a suggestive error message
    template<int position, char accept>
    consteval mode_t expected_character_at_position(const char * const permission_string)
{
        const char c = permission_string[position];
        if(c == accept) { return 1 << (8-position); }
        if(c == '-') { return 0; }
        throw invalid_permission_character{};
    }

    // Worker function, assumes that it is called only by
    // from_readable_mode() or operator ""_mode, which have
    // done checking beforehand for the validity of s.
    //
    // It's wordy, but that does mean you get a suggestive
    // error message with the bit-position and expected character.
    consteval mode_t from_readable_mode_string(const char * const s)
{
    return
        detail::expected_character_at_position<0, 'r'>(s) |
        detail::expected_character_at_position<1, 'w'>(s) |
        detail::expected_character_at_position<2, 'x'>(s) |
        detail::expected_character_at_position<3, 'r'>(s) |
        detail::expected_character_at_position<4, 'w'>(s) |
        detail::expected_character_at_position<5, 'x'>(s) |
        detail::expected_character_at_position<6, 'r'>(s) |
        detail::expected_character_at_position<7, 'w'>(s) |
        detail::expected_character_at_position<8, 'x'>(s);
    }
}

// Turns a 9-character permission string like you would get
// from `ls -l` into the mode_t that it represents.
//
// (the [10] is because of the trailing NUL byte)
consteval mode_t from_readable_mode(const char (&permission_string)[10])
{
    return detail::from_readable_mode_string(permission_string);
}

// Turns a 9-character permission string like you would get
// from `ls -l` into the mode_t that it represents.
consteval mode_t operator""_mode(const char *s, size_t len)
{
    if (len != 9)
    {
        throw detail::invalid_permission_string_length{};
    }
    return detail::from_readable_mode_string(s);
}
// Shorthand for printing a line with a string literal
// and also getting the consteval value of the mode_t
// represented by that literal.
#define EXAMPLE(x) 
    std::cout << x << 't' << std::oct << from_readable_mode(x) << std::dec << 'n'

int main() {
    EXAMPLE("rw-rw----"); // 660
    EXAMPLE("rwxr-xr-x"); // 755
    EXAMPLE("r--------"); // 400
    EXAMPLE("-w-r---wx"); // 243??

    // These won't even compile, but show compile-time
    // bugs being caught.
    //
    // EXAMPLE("bug"); // Not a 9-character literal
    // EXAMPLE("rwxbadbug");

    // Using _mode to indicate what the string means
    // might be easier to read.
    std::cout << "rw-r--r--" << 't' << std::oct << "rw-r--r--"_mode << std::dec << 'n';

    // These won't even compile, for the same reasons
    //
    // std::cout << "birb"_mode << "------uwu"_mode;

    return 0;
}

视频

这里放个预告哈,cppcon视频出完了。后面准备更新cppcon2023了

开源项目介绍

  • • https://github.com/lhmouse/asteria一个脚本语言,可嵌入
  • • https://github.com/sebbbi/OffsetAllocator 一个O1内存实现,类似TLSF 这里标记个TODO有时间研究一下
  • • https://github.com/Jaysmito101/rusty.hpp 抄rust的特性实现
  • • 一个function实现 https://godbolt.org/z/9TT4846xo 只存指针。对于lambda应该不行,生命周期有问题,另外有五个指针,相当于模拟vtable

0 人点赞