C++ 中文周刊 第121期

2024-07-30 14:13:41 浏览数 (2)

RSS https://github.com/wanghenshui/cppweeklynews/releases.atom

欢迎投稿,推荐或自荐文章/软件/资源等

请提交 issue[1]

本周内容不多

资讯

标准委员会动态/ide/编译器信息放在这里

编译器信息最新动态推荐关注hellogcc公众号 本周更新 2023-07-05 第209期

文章

  • • atomic工具:C 内存模型与现代硬件(一)[2]
  • • atomic工具:C 内存模型与现代硬件(二)[3]

虽然是老文了,对于清晰概念还是有一定的帮助的。讨论了很多基础概念,以及各种时序场景,我觉得都可以复述一遍这个演讲,增加自己理解

能讲给别人进步最快,各位可能都讲过题,要不要挑战一下

  • • Did you know that run-time dispatching over type-list can be implemented many different ways [4]
代码语言:javascript复制
template <template <class...> class TList, class TEvent, class... TEvents, class T, class TExpr>
constexpr auto dispatch(TList<TEvent, TEvents...>, const int id, const T& data,
                        const TExpr& expr) -> decltype(expr(TEvent{data})) {
    switch (id) {
        case TEvent::id:
            return expr(TEvent{data});

        default:
            if constexpr (sizeof...(TEvents) > 0) {
                return dispatch(TList<TEvents...>{}, id, data, expr);
            }
    }
    return {};
}
代码语言:javascript复制

不太懂有啥用

  • • Visiting a std::variant safely[5]

overload谁都会写,但是可能存在隐式转换,比如

代码语言:javascript复制
template<class... Ts>
struct overload : Ts... {
  using Ts::operator()...;
};

template<class... Ts>
overload(Ts...) -> overload<Ts...>;

int main() {
  const std::variant<int, bool> v{true};
  std::visit(
      overload{
          [](int val) { std::cout << val; },
          [](bool val) { std::cout << std::boolalpha << val; }, // 1
      },
      v);
}
代码语言:javascript复制

这里的int bool模糊,可能发生转换。如果把1这行注释掉,visit一样能遍历,走int哪个分支。发生转换了我超!

如何杀死这种异常?

代码语言:javascript复制
template<class...>
constexpr bool always_false_v = false;

template<class... Ts>
struct overload : Ts... {
  using Ts::operator()...;

  template<typename T>
  constexpr void operator()(T) const {
    static_assert(always_false_v<T>, "Unsupported type");
    // c  23 static_assert(false, "Unsupported type");
  }
};

template<class... Ts>
overload(Ts...) -> overload<Ts...>;

int main() {
  const std::variant<int, bool> v{true};

  std::visit(overload{
                 [](int val) { std::cout << val; },
                 [](bool val) { std::cout << std::boolalpha << val; },
             },
             v);
}
代码语言:javascript复制

正常走Ts的operator(),如果类型不匹配(隐式转换的前提是没有别的实现),最佳匹配是static_assert的那个实现,匹配中,编译报错

自己的overload都改一下,这个还是值得一用的

  • • Constrain your user-defined conversions[6]

这个问题其实和上面差不多,指针有隐式转换成bool的风险,可能一不小心写出bug

你比如有个类是这样的

代码语言:javascript复制
class string_literal {
public:
    operator const char*() const noexcept {
        return m_ptr;
    }
};
代码语言:javascript复制

问题来了,char*转bool

代码语言:javascript复制
int main() {
    string_literal str;

    if (str) {} //char* 转bool我超

    str   1; // char*取偏移我超
}
代码语言:javascript复制

这种代码,如果真的有bug被引入,都不想琢磨,太恶心了

怎么修?限制转换,强制约束类型,和上面的方案不谋而合

代码语言:javascript复制
class string_literal{
public:
    template <std::same_as<const char*> T>
    operator T() const noexcept
{
        return m_ptr;
    }
};
代码语言:javascript复制

c 20的concept感觉没有大规模推开,大家为了旧代码没升级,没用上

  • • Parsing time stamps faster with SIMD instructions[7]

代码在这里 https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/blob/master/2023/07/01/src/sse_date.c

解析时间,常规写法

代码语言:javascript复制
bool parse_time(const char *date_string, uint32_t *time_in_second) {
  const char *end = NULL;
  struct tm tm;
  if ((end = strptime(date_string, "%Y%m%d%H%M%S", &tm)) == NULL) {
    return false;
  }
  *time_in_second = (uint32_t)mktime_from_utc(&tm);
  return true;
}
代码语言:javascript复制

SSE代码我看不懂,直接贴了

代码语言:javascript复制
bool sse_parse_time(const char *date_string, uint32_t *time_in_second) {
  // We load the block of digits. We subtract 0x30 (the code point value of the
  // character '0'), and all bytes values should be between 0 and 9,
  // inclusively. We know that some character must be smaller that 9, for
  // example, we cannot have more than 59 seconds and never 60 seconds, in the
  // time stamp string. So one character must be between 0 and 5. Similarly, we
  // start the hours at 00 and end at 23, so one character must be between 0
  // and 2. We do a saturating subtraction of the maximum: the result of such a
  // subtraction should be zero if the value is no larger. We then use a special
  // instruction to multiply one byte by 10, and sum it up with the next byte,
  // getting a 16-bit value. We then repeat the same approach as before,
  // checking that the result is not too large.
  //
  __m128i v = _mm_loadu_si128((const __m128i *)date_string);
  // loaded YYYYMMDDHHmmSS.....
  v = _mm_xor_si128(v, _mm_set1_epi8(0x30));
  // W can use _mm_sub_epi8 or _mm_xor_si128 for the subtraction above.
  // subtracting by 0x30 (or '0'), turns all values into a byte value between 0
  // and 9 if the initial input was made of digits.
  __m128i limit =
      _mm_setr_epi8(9, 9, 9, 9, 1, 9, 3, 9, 2, 9, 5, 9, 5, 9, -1, -1);
  // credit @aqrit
  // overflows are still possible, if hours are in the range 24 to 29
  // of if days are in the range 32 to 39
  // or if months are in the range 12 to 19.
  __m128i abide_by_limits = _mm_subs_epu8(v, limit); // must be all zero

  __m128i byteflip = _mm_setr_epi64((__m64)0x0607040502030001ULL,
                                    (__m64)0x0e0f0c0d0a0b0809ULL);

  __m128i little_endian = _mm_shuffle_epi8(v, byteflip);
  __m128i limit16 = _mm_setr_epi16(0x0909, 0x0909, 0x0102, 0x0301, 0x0203,
                                   0x0509, 0x0509, -1);
  __m128i abide_by_limits16 =
      _mm_subs_epu16(little_endian, limit16); // must be all zero

  __m128i combined_limits =
      _mm_or_si128(abide_by_limits16, abide_by_limits); // must be all zero
  // We want to disallow 0s for days and months... and we want to make
  // sure that we don't go back in time prior to 1900.
  __m128i limit16_low = _mm_setr_epi16(0x0109, 0, 0x0001, 0x0001, 0, 0, 0, 0);

  __m128i abide_by_limits16_low =
      _mm_subs_epu16(limit16_low, little_endian); // must be all zero
  combined_limits = _mm_or_si128(combined_limits, abide_by_limits16_low);

  if (!_mm_test_all_zeros(combined_limits, combined_limits)) {
    return false;
  }
  // 0x000000SS0mmm0HHH`00DD00MM00YY00YY
  //////////////////////////////////////////////////////
  // pmaddubsw has a high latency (e.g., 5 cycles) and is
  // likely a performance bottleneck.
  /////////////////////////////////////////////////////
  const __m128i weights = _mm_setr_epi8(
      //     Y   Y   Y   Y   m   m   d   d   H   H   M   M   S   S   -   -
      10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 0, 0);
  v = _mm_maddubs_epi16(v, weights);

  uint64_t hi = (uint64_t)_mm_extract_epi64(v, 1);
  uint64_t seconds = (hi * 0x0384000F00004000) >> 46;
  uint64_t lo = (uint64_t)_mm_extract_epi64(v, 0);
  uint64_t yr = (lo * 0x64000100000000) >> 48;

  // We checked above that dy and mo are >= 1
  uint64_t mo = ((lo >> 32) & 0xff) - 1;
  uint64_t dy = (uint64_t)_mm_extract_epi8(v, 6);

  bool is_leap_yr = is_leap_year((int)yr);

  if (dy > (uint64_t)mdays[mo]) { // unlikely branch
    if (mo == 1 && is_leap_yr) {
      if (dy != 29) {
        return false;
      }
    } else {
      return false;
    }
  }
  uint64_t days = 365 * (yr - 1970)   (uint64_t)leap_days(1970, (int)yr);

  days  = (uint64_t)mdays_cumulative[mo];
  days  = is_leap_yr & (mo > 1);

  days  = dy - 1;
  uint64_t time_in_second64 = seconds   days * 60 * 60 * 24;
  *time_in_second = (uint32_t)time_in_second64;
  return time_in_second64 == (uint32_t)time_in_second64;
}


static const int mdays_minus_one[] = {30, 27, 30, 29, 30, 29, 30, 30, 29, 30, 29, 30};

// uses more instructions than sse_parse_time but might be slightly faster.
bool sse_parse_time_alt(const char *date_string, uint32_t *time_in_second) {
  // We load the block of digits. We subtract 0x30 (the code point value of the
  // character '0'), and all bytes values should be between 0 and 9,
  // inclusively. We know that some character must be smaller that 9, for
  // example, we cannot have more than 59 seconds and never 60 seconds, in the
  // time stamp string. So one character must be between 0 and 5. Similarly, we
  // start the hours at 00 and end at 23, so one character must be between 0
  // and 2. We do a saturating subtraction of the maximum: the result of such a
  // subtraction should be zero if the value is no larger. We then use a special
  // instruction to multiply one byte by 10, and sum it up with the next byte,
  // getting a 16-bit value. We then repeat the same approach as before,
  // checking that the result is not too large.
  //
  // We compute the month the good old ways, as an integer in [0,11], we
  // check for overflows later.
  uint64_t mo = (uint64_t)((date_string[4]-0x30)*10   (date_string[5]-0x30) - 1);
  __m128i v = _mm_loadu_si128((const __m128i *)date_string);
  // loaded YYYYMMDDHHmmSS.....
  v = _mm_xor_si128(v, _mm_set1_epi8(0x30));
  // W can use _mm_sub_epi8 or _mm_xor_si128 for the subtraction above.
  // subtracting by 0x30 (or '0'), turns all values into a byte value between 0
  // and 9 if the initial input was made of digits.
  __m128i limit =
      _mm_setr_epi8(9, 9, 9, 9, 1, 9, 3, 9, 2, 9, 5, 9, 5, 9, -1, -1);
  // credit @aqrit
  // overflows are still possible, if hours are in the range 24 to 29
  // of if days are in the range 32 to 39
  // or if months are in the range 12 to 19.
  __m128i abide_by_limits = _mm_subs_epu8(v, limit); // must be all zero

  __m128i byteflip = _mm_setr_epi64((__m64)0x0607040502030001ULL,
                                    (__m64)0x0e0f0c0d0a0b0809ULL);

  __m128i little_endian = _mm_shuffle_epi8(v, byteflip);
  __m128i limit16 = _mm_setr_epi16(0x0909, 0x0909, 0x0102, 0x0301, 0x0203,
                                   0x0509, 0x0509, -1);
  __m128i abide_by_limits16 =
      _mm_subs_epu16(little_endian, limit16); // must be all zero

  __m128i combined_limits =
      _mm_or_si128(abide_by_limits16, abide_by_limits); // must be all zero

  if (!_mm_test_all_zeros(combined_limits, combined_limits)) {
    return false;
  }
  // 0x000000SS0mmm0HHH`00DD00MM00YY00YY
  //////////////////////////////////////////////////////
  // pmaddubsw has a high latency (e.g., 5 cycles) and is
  // likely a performance bottleneck.
  /////////////////////////////////////////////////////
  const __m128i weights = _mm_setr_epi8(
      //     Y   Y   Y   Y   m   m   d   d   H   H   M   M   S   S   -   -
      10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 10, 1, 0, 0);
  v = _mm_maddubs_epi16(v, weights);

  uint64_t hi = (uint64_t)_mm_extract_epi64(v, 1);
  uint64_t seconds = (hi * 0x0384000F00004000) >> 46;
  uint64_t lo = (uint64_t)_mm_extract_epi64(v, 0);
  uint64_t yr = (lo * 0x64000100000000) >> 48;

  // We compute the day (starting at zero). We implicitly 
  // check for overflows later.
  uint64_t dy = (uint64_t)_mm_extract_epi8(v, 6) - 1;

  bool is_leap_yr = is_leap_year((int)yr);
  if(mo > 11) { return false; } // unlikely branch
  if (dy > (uint64_t)mdays_minus_one[mo]) { // unlikely branch
    if (mo == 1 && is_leap_yr) {
      if (dy != 29 - 1) {
        return false;
      }
    } else {
      return false;
    }
  }
  uint64_t days = 365 * (yr - 1970)   (uint64_t)leap_days(1970, (int)yr);

  days  = (uint64_t)mdays_cumulative[mo];
  days  = is_leap_yr & (mo > 1);

  days  = dy;
  uint64_t time_in_second64 = seconds   days * 60 * 60 * 24;
  *time_in_second = (uint32_t)time_in_second64;
  return time_in_second64 == (uint32_t)time_in_second64;
}
代码语言:javascript复制

  • • Having fun with string literal suffixes in C [8]

还是介绍UDL

代码语言:javascript复制
#include <regex>
struct convenience_matcher {
  convenience_matcher(const char *str) : re(str) {}
  bool match(const std::string &s) {
    std::smatch base_match;
    return std::regex_match(s, base_match, re);
  }
  bool operator()(const std::string &s) { return match(s); }
  std::regex re;
};
 "\d "_re("123") // true
 "\d "_re("a23") // false
 R"(d )"_re("123") // true
 R"(d )"_re("a23") // false
代码语言:javascript复制

看个乐,真要需要这种需求不如在线正则网站。只是个例子教你怎么用UDL

  • • How “static storage for initializers” did at Varna[9]

他这个背景我没怎么看懂,这里标记个TOOD

  • • Notes on float and multi-byte delta compression [10]

看不懂,这里标记个TOOD

  • • Raymond Chen怎么这么能写啊我超,我一个没看
  • • The Old New Thing - How to wait for multiple C coroutines to complete before propagating failure, wrapping the awaitable[11]
  • • The Old New Thing - How to wait for multiple C coroutines to complete before propagating failure, preallocating the coroutine frame[12]
  • • The Old New Thing - How to wait for multiple C coroutines to complete before propagating failure, memory allocation failure[13]
  • • The Old New Thing - How to wait for multiple C coroutines to complete before propagating failure, symmetric transfer[14]
  • • The Old New Thing - How to wait for multiple C coroutines to complete before propagating failure, custom promise[15]

视频

  • • From Templates to Concepts: Metaprogramming in C - Alex Dathskovsky - CppNow 2023[16]

基础概念,没啥说的其实,就是怎么用concept

cppnow视频也放出来的若干个。最近尽可能看这个,和去年的演讲

  • • Reflect *this!: Design and Implementation of a Mixin Library with Static Reflection - Andy Soffer[17]

https://asoffer.github.io/reflect-this-presentation/cppnow-2023/

讲反射的。讨论了一些现有的设计,实现了自己的一种设计,组合一波,还算有点意思

开源项目需要人手

  • • asteria[18] 一个脚本语言,可嵌入,长期找人,希望胖友们帮帮忙,也可以加群384042845和作者对线
  • • Unilang[19] deepin的一个通用编程语言,点子有点意思,也缺人,感兴趣的可以github讨论区或者deepin论坛看一看。这里也挂着长期推荐了
  • • gcc-mcf[20] 懂的都懂

本文永久链接[21]

如果有疑问评论最好在上面链接到评论区里评论,这样方便搜索,微信公众号有点封闭/知乎吞评论

引用链接

[1] 提交 issue: https://github.com/wanghenshui/cppweeklynews/issues [2] atomic工具:C 内存模型与现代硬件(一): https://zhuanlan.zhihu.com/p/625860910 [3] atomic工具:C 内存模型与现代硬件(二): https://zhuanlan.zhihu.com/p/632622548 [4] Did you know that run-time dispatching over type-list can be implemented many different ways : https://github.com/QuantlabFinancial/cpp_tip_of_the_week/ [5] Visiting a std::variant safely: https://andreasfertig.blog/2023/07/visiting-a-stdvariant-safely/ [6] Constrain your user-defined conversions: https://www.foonathan.net/2023/07/constrain-user-defined-conversions/ [7] Parsing time stamps faster with SIMD instructions: https://lemire.me/blog/2023/07/01/parsing-time-stamps-faster-with-simd-instructions/ [8] Having fun with string literal suffixes in C : https://lemire.me/blog/2023/07/05/having-fun-with-string-literal-suffixes-in-c/ [9] How “static storage for initializers” did at Varna: https://quuxplusone.github.io/blog/2023/07/05/p2752-at-varna/ [10] Notes on float and multi-byte delta compression : http://cbloomrants.blogspot.com/2023/07/notes-on-float-and-multi-byte-delta.html [11] The Old New Thing - How to wait for multiple C coroutines to complete before propagating failure, wrapping the awaitable: https://devblogs.microsoft.com/oldnewthing/20230706-00/?p=108398 [12] The Old New Thing - How to wait for multiple C coroutines to complete before propagating failure, preallocating the coroutine frame: https://devblogs.microsoft.com/oldnewthing/20230705-00/?p=108392 [13] The Old New Thing - How to wait for multiple C coroutines to complete before propagating failure, memory allocation failure: https://devblogs.microsoft.com/oldnewthing/20230704-00/?p=108389 [14] The Old New Thing - How to wait for multiple C coroutines to complete before propagating failure, symmetric transfer: https://devblogs.microsoft.com/oldnewthing/20230703-00/?p=108387 [15] The Old New Thing - How to wait for multiple C coroutines to complete before propagating failure, custom promise: https://devblogs.microsoft.com/oldnewthing/20230630-00/?p=108382 [16] From Templates to Concepts: Metaprogramming in C - Alex Dathskovsky - CppNow 2023: https://www.youtube.com/watch?v=x6_o-jz_Q-8&ab_channel=CppNow [17] Reflect *this!: Design and Implementation of a Mixin Library with Static Reflection - Andy Soffer: https://www.youtube.com/watch?v=kFChd-RrSP8&t=171s&ab_channel=CppNow [18] asteria: https://github.com/lhmouse/asteria [19] Unilang: https://github.com/linuxdeepin/unilang [20] gcc-mcf: https://gcc-mcf.lhmouse.com/ [21] 本文永久链接: https://wanghenshui.github.io/cppweeklynews/posts/121.html

0 人点赞