周刊项目地址[1]
RSS https://github.com/wanghenshui/cppweeklynews/releases.atom
欢迎投稿,推荐或自荐文章/软件/资源等
请提交 issue[2]
本周内容不多
资讯
标准委员会动态/ide/编译器信息放在这里
编译器信息最新动态推荐关注hellogcc公众号 本周更新 2023-07-19 第211期
文章
- • Inside boost::concurrent_flat_map[3]
boost 1.83会有个 boost::concurrent_flat_map
, 这篇文章带你了解设计思路。还是开地址法,还并发,还快
测了一下是吊锤tbb的。没测和folly::concurrentHashMap的比较
其实这个文章可以展开讲讲
- • Crash in js::StackUses in js::frontend::BytecodeSection::updateDepth(js::frontend::BytecodeOffset) (Samsung CPU Issue)[4]
浏览器安卓端崩溃最终怀疑是CPU的问题
- • Peeking under the hood of GCC's __builtin_expect[5]
总之就是利用CPU流水线先走一个快速路径,再检查条件,也就是所谓的分支预测
你可能想问了,我能不能让CPU别预测,别显得你多聪明了老老实实算就完了
__builtin_unpredictable() https://clang.llvm.org/docs/LanguageExtensions.html#builtin-unpredictable
- • 尝试实现一个Pipeline执行引擎[6]
基于async_simple的。有点意思
- • 分布式块存储性能调优之PGO[7]
了解一下PGO流程
- • 基于C 20无栈协程与protobuf的轻量级、高性能Rpc框架[8]
看个热闹
- • Fun with gRPC and C [9]
手把手教你写个grpc server
- • Fast decoding of base32 strings[10]
SIMD时间,字符转数字
常规, 一个一个比
代码语言:javascript复制if (ch >= '0' && ch <= '9')
d = ch - '0';
else if (ch >= 'A' && ch <= 'V')
d = ch - 'A' 10;
else if (ch >= 'a' && ch <= 'v')
d = ch - 'a' 10;
else
return -1;
进化版本,唉我会打表了
代码语言:javascript复制size_t base32hex_simple(uint8_t *dst, const uint8_t *src) {
static const uint8_t table[256] = {
32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32,
32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32,
32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32,
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 32, 32, 32, 32, 32, 32,
32, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28, 29, 30, 31, 32, 32, 32, 32, 32, 32, 32, 32, 32,
32, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28, 29, 30, 31, 32, 32, 32, 32, 32, 32, 32, 32, 32,
32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32,
32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32,
32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32,
32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32,
32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32,
32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32,
32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32,
32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32,
};
bool valid = true;
const uint8_t *srcinit = src;
do {
uint64_t r = 0;
for (size_t i = 0; i < 8; i ) {
uint8_t x = table[*src];
src ;
if (x > 31) {
r <<= (5 * (8 - i));
valid = false;
break;
}
r <<= 5;
r |= x;
}
r = (unsigned long int)_bswap64((long long int)r);
uint64_t rs = ((uint64_t)r >> (3 * 8));
memcpy(dst, (const char *)&rs, 8);
dst = 5;
} while (valid);
return (size_t)(src - srcinit);
}
超神版本 SIMD直接贴代码
代码语言:javascript复制size_t base32hex_simd(uint8_t *dst, const uint8_t *src) {
bool valid = true;
const __m128i delta_check =
_mm_setr_epi8(-16, -32, -48, 70, -65, 41, -97, 9, 0, 0, 0, 0, 0, 0, 0, 0);
const __m128i delta_rebase =
_mm_setr_epi8(0, 0, 0, -48, -55, -55, -87, -87, 0, 0, 0, 0, 0, 0, 0, 0);
const uint8_t *srcinit = src;
do {
__m128i v = _mm_loadu_si128((__m128i *)src);
__m128i hash_key = _mm_and_si128(_mm_srli_epi32(v, 4), _mm_set1_epi8(0x0F));
__m128i check = _mm_add_epi8(_mm_shuffle_epi8(delta_check, hash_key), v);
v = _mm_add_epi8(v, _mm_shuffle_epi8(delta_rebase, hash_key));
unsigned int m = (unsigned)_mm_movemask_epi8(check);
if (m) {
int length = __builtin_ctz(m);
if (length == 0) {
break;
}
src = length;
__m128i zero_mask =
_mm_loadu_si128((__m128i *)(zero_masks 16 - length));
v = _mm_andnot_si128(zero_mask, v);
valid = false;
} else { // common case
src = 16;
}
v = _mm_maddubs_epi16(v, _mm_set1_epi32(0x01200120));
v = _mm_madd_epi16(
v, _mm_set_epi32(0x00010400, 0x00104000, 0x00010400, 0x00104000));
// ...00000000`0000eeee`efffffgg`ggghhhhh`00000000`aaaaabbb`bbcccccd`dddd0000
v = _mm_or_si128(v, _mm_srli_epi64(v, 48));
v = _mm_shuffle_epi8(
v, _mm_set_epi8(0, 0, 0, 0, 0, 0, 12, 13, 8, 9, 10, 4, 5, 0, 1, 2));
/* decoded 10 bytes... but write 16 cause why not? */
_mm_storeu_si128((__m128i *)dst, v);
dst = 10;
} while (valid);
return (size_t)(src - srcinit);
}
还有SWAR版本,我直接贴仓库连接,不贴代码了 https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/blob/master/2023/07/20/src/base32.c
- • How to clone a Windows Runtime map in the face of possible concurrent modification, part 2[11]
- • How to clone a Windows Runtime map in the face of possible concurrent modification, part 1[12]
- • How to clone a Windows Runtime vector in the face of possible concurrent modification, part 4[13]
- • Cloning a Windows Runtime vector in the face of possible concurrent modification, denial of service?[14]
- • How to clone a Windows Runtime vector in the face of possible concurrent modification, part 3[15]
raymond chen的window时间,看不懂
开源项目需要人手
- • asteria[16] 一个脚本语言,可嵌入,长期找人,希望胖友们帮帮忙,也可以加群384042845和作者对线
- • Unilang[17] deepin的一个通用编程语言,点子有点意思,也缺人,感兴趣的可以github讨论区或者deepin论坛看一看。这里也挂着长期推荐了
- • gcc-mcf[18] 懂的都懂
新项目介绍/版本更新
- • https://github.com/jgaa/glad 一个基于ASIO的cache server。看个乐呵
- • https://github.com/jgaa/nsblast 一个dns server
本文永久链接[19]
如果有疑问评论最好在上面链接到评论区里评论,这样方便搜索,微信公众号有点封闭/知乎吞评论
引用链接
[1]
周刊项目地址: https://github.com/wanghenshui/cppweeklynews
[2]
提交 issue: https://github.com/wanghenshui/cppweeklynews/issues
[3]
Inside boost::concurrent_flat_map: http://bannalia.blogspot.com/2023/07/inside-boostconcurrentflatmap.html
[4]
Crash in js::StackUses in js::frontend::BytecodeSection::updateDepth(js::frontend::BytecodeOffset) (Samsung CPU Issue): https://bugzilla.mozilla.org/show_bug.cgi?id=1833315
[5]
Peeking under the hood of GCC's __builtin_expect: https://tbrindus.ca/how-builtin-expect-works/
[6]
尝试实现一个Pipeline执行引擎: https://zhuanlan.zhihu.com/p/643452815
[7]
分布式块存储性能调优之PGO: https://zhuanlan.zhihu.com/p/644449363
[8]
基于C 20无栈协程与protobuf的轻量级、高性能Rpc框架: https://zhuanlan.zhihu.com/p/643860021
[9]
Fun with gRPC and C : https://lastviking.eu/fun_with_gRPC_and_C /index.html
[10]
Fast decoding of base32 strings: https://lemire.me/blog/2023/07/20/fast-decoding-of-base32-strings/
[11]
How to clone a Windows Runtime map in the face of possible concurrent modification, part 2: https://devblogs.microsoft.com/oldnewthing/20230720-00/?p=108466
[12]
How to clone a Windows Runtime map in the face of possible concurrent modification, part 1: https://devblogs.microsoft.com/oldnewthing/20230719-00/?p=108462
[13]
How to clone a Windows Runtime vector in the face of possible concurrent modification, part 4: https://devblogs.microsoft.com/oldnewthing/20230718-00/?p=108458
[14]
Cloning a Windows Runtime vector in the face of possible concurrent modification, denial of service?: https://devblogs.microsoft.com/oldnewthing/20230717-00/?p=108454
[15]
How to clone a Windows Runtime vector in the face of possible concurrent modification, part 3: https://devblogs.microsoft.com/oldnewthing/20230714-00/?p=108448
[16]
asteria: https://github.com/lhmouse/asteria
[17]
Unilang: https://github.com/linuxdeepin/unilang
[18]
gcc-mcf: https://gcc-mcf.lhmouse.com/
[19]
本文永久链接: https://wanghenshui.github.io/cppweeklynews/posts/123.html