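https://github.com/json-iterator/go is an excellent Go JSON parsing library that is fully compatible with the official JSON library (encoding/json). Compared with the official parser, its optimizations are: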
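1. Single-pass scanning: all parsing is done directly on the byte stream in one pass. readInt and readString finish in a single pass: they do not split the JSON into tokens first, but read characters directly and convert them to the target type. readFloat and readDouble are implemented the same way. This avoids repeated scanning and also keeps memory allocation and release to a minimum.
2. It does not parse tokens and then branch on them. Instead, it first works out the decoder for the Go type being bound to and caches it. Then, while walking the JSON string, for each key it reads it looks up the matching decoder in a map, taking the current JSON context into account, and uses that decoder to parse and bind the value.
3. Fields that do not need to be parsed are skipped together with all of their nested objects, because no decoder matches them, which avoids unnecessary parsing. When a whole object is skipped, the nested field names are not examined.
4. Binding to an object does not use the reflection API. Instead, it takes the raw pointer out of the interface{} and casts it to the correct pointer type to set the value, for example: *((*int)(ptr)) = iter.ReadInt()
5. It avoids map allocation and lookup where possible. For structs with at most 10 fields, it computes a hash of each key and associates each field with its decoder function, so that when a key is parsed it is matched directly by hash value, avoiding string comparison as well as map allocation and lookup.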
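Points 4 and 5 can be illustrated with a small standard-library sketch. It is not json-iterator's actual code: the `User` type, `fieldBinding`, `hashKey` and `bindField` are hypothetical names chosen for this example, FNV-1a is an assumed hash function, and hash collisions and number-parsing errors are deliberately ignored.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strconv"
	"unsafe"
)

// User is a hypothetical target type for this sketch.
type User struct {
	ID   int
	Name string
}

// fieldDecoder converts raw text and stores it at ptr, which must point
// at a field of the matching Go type.
type fieldDecoder func(ptr unsafe.Pointer, raw string)

func decodeInt(ptr unsafe.Pointer, raw string) {
	n, _ := strconv.Atoi(raw) // error handling omitted in this sketch
	*(*int)(ptr) = n          // typed pointer write, no reflection
}

func decodeString(ptr unsafe.Pointer, raw string) {
	*(*string)(ptr) = raw
}

// fieldBinding is built once per struct type and reused for every document.
type fieldBinding struct {
	hash   uint32
	offset uintptr
	decode fieldDecoder
}

// hashKey hashes a JSON key with FNV-1a (an assumption of this sketch).
func hashKey(key string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32()
}

// userBindings is a short slice instead of a map: for a handful of
// fields, comparing integers in a loop avoids map allocation and lookup.
var userBindings = []fieldBinding{
	{hashKey("ID"), unsafe.Offsetof(User{}.ID), decodeInt},
	{hashKey("Name"), unsafe.Offsetof(User{}.Name), decodeString},
}

// bindField stores raw into the field of u whose name hashes like key.
// It reports false for unknown keys, which a real decoder would skip.
func bindField(u *User, key, raw string) bool {
	h := hashKey(key)
	base := unsafe.Pointer(u)
	for _, b := range userBindings {
		if b.hash == h {
			b.decode(unsafe.Pointer(uintptr(base)+b.offset), raw)
			return true
		}
	}
	return false
}

func main() {
	var u User
	bindField(&u, "ID", "42")
	bindField(&u, "Name", "Reds")
	fmt.Printf("%+v\n", u)
}
```

The bindings are computed once and then only integers are compared per key, which is the idea behind matching by hash rather than by string. A production decoder would also have to detect hash collisions when it builds the table.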
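In short, with this series of optimizations its deserialization can be up to 10 times faster than the official standard library in specific scenarios. Many readers online have questioned that figure, so before analyzing the source I ran the benchmark it provides: https://github.com/json-iterator/go-benchmark/blob/master/src/github.com/json-iterator/go-benchmark/benchmark_medium_payload_test.go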
With the sample data provided by json-iterator left unchanged, the results are striking.
var mediumFixture []byte = []byte(`{
"person": {
"id": "d50887ca-a6ce-4e59-b89f-14f0b5d03b03",
"name": {
"fullName": "Leonid Bugaev",
"givenName": "Leonid",
"familyName": "Bugaev"
},
"email": "leonsbox@gmail.com",
"gender": "male",
"location": "Saint Petersburg, Saint Petersburg, RU",
"geo": {
"city": "Saint Petersburg",
"state": "Saint Petersburg",
"country": "Russia",
"lat": 59.9342802,
"lng": 30.3350986
},
"bio": "Senior engineer at Granify.com",
"site": "http://flickfaver.com",
"avatar": "https://d1ts43dypk8bqh.cloudfront.net/v1/avatars/d50887ca-a6ce-4e59-b89f-14f0b5d03b03",
"employment": {
"name": "www.latera.ru",
"title": "Software Engineer",
"domain": "gmail.com"
},
"facebook": {
"handle": "leonid.bugaev"
},
"github": {
"handle": "buger",
"id": 14009,
"avatar": "https://avatars.githubusercontent.com/u/14009?v=3",
"company": "Granify",
"blog": "http://leonsbox.com",
"followers": 95,
"following": 10
},
"twitter": {
"handle": "flickfaver",
"id": 77004410,
"bio": null,
"followers": 2,
"following": 1,
"statuses": 5,
"favorites": 0,
"location": "",
"site": "http://flickfaver.com",
"avatar": null
},
"linkedin": {
"handle": "in/leonidbugaev"
},
"googleplus": {
"handle": null
},
"angellist": {
"handle": "leonid-bugaev",
"id": 61541,
"bio": "Senior engineer at Granify.com",
"blog": "http://buger.github.com",
"site": "http://buger.github.com",
"followers": 41,
"avatar": "https://d1qb2nb5cznatu.cloudfront.net/users/61541-medium_jpg?1405474390"
},
"klout": {
"handle": null,
"score": null
},
"foursquare": {
"handle": null
},
"aboutme": {
"handle": "leonid.bugaev",
"bio": null,
"avatar": null
},
"gravatar": {
"handle": "buger",
"urls": [
],
"avatar": "http://1.gravatar.com/avatar/f7c8edd577d13b8930d5522f28123510",
"avatars": [
{
"url": "http://1.gravatar.com/avatar/f7c8edd577d13b8930d5522f28123510",
"type": "thumbnail"
}
]
},
"fuzzy": false
},
"company": null
}`)
It contains many fields that do not need to be parsed, and the official library is slow at handling them. Here are the results:
lib        decode          encode
std        156737 ns/op    2392 ns/op
jsoniter   18733 ns/op     2435 ns/op
easyjson   45686 ns/op     1793 ns/op
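Much of the work on this payload is getting past fields that are never bound. As a rough illustration of skipping a nested value without looking at its field names, here is a standard-library sketch. It is not json-iterator's implementation: `skipContainer` is a hypothetical helper, and it assumes well-formed JSON.

```go
package main

import (
	"errors"
	"fmt"
)

// skipContainer returns the index just past the JSON object or array
// that starts at data[start]. It only tracks nesting depth and string
// boundaries; keys and values inside are never interpreted.
func skipContainer(data []byte, start int) (int, error) {
	if start >= len(data) || (data[start] != '{' && data[start] != '[') {
		return 0, errors.New("not at the start of an object or array")
	}
	depth := 0
	inString := false
	for i := start; i < len(data); i++ {
		c := data[i]
		if inString {
			switch c {
			case '\\':
				i++ // skip the escaped character
			case '"':
				inString = false
			}
			continue
		}
		switch c {
		case '"':
			inString = true
		case '{', '[':
			depth++
		case '}', ']':
			depth--
			if depth == 0 {
				return i + 1, nil
			}
		}
	}
	return 0, errors.New("unterminated value")
}

func main() {
	data := []byte(`{"a":{"b":"}\"{","c":[1,2,{"d":null}]},"e":1}`)
	end, err := skipContainer(data, 5) // data[5] opens the value of "a"
	fmt.Println(string(data[end:]), err)
}
```

Braces and brackets inside string literals are ignored, so a value such as `"}\"{"` does not confuse the depth counter.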
However, after adjusting the benchmark data slightly, the gap changes considerably.
var mediumFixture1 []byte = []byte(`{
"person": {
"id": "d50887ca-a6ce-4e59-b89f-14f0b5d03b03",
"name": {
"fullName": "Leonid Bugaev",
"givenName": "Leonid",
"familyName": "Bugaev"
},
"github": {
"handle": "buger",
"id": 14009,
"avatar": "https://avatars.githubusercontent.com/u/14009?v=3",
"company": "Granify",
"blog": "http://leonsbox.com",
"followers": 95,
"following": 10
},
"gravatar": {
"handle": "buger",
"urls": [
],
"avatar": "http://1.gravatar.com/avatar/f7c8edd577d13b8930d5522f28123510",
"avatars": [
{
"url": "http://1.gravatar.com/avatar/f7c8edd577d13b8930d5522f28123510",
"type": "thumbnail"
}
]
},
"fuzzy": false
},
"company": null
}`)
The results with the adjusted data:
lib        decode         encode
std        9301 ns/op     953.3 ns/op
jsoniter   2262 ns/op     913.9 ns/op
easyjson   2757 ns/op     733.2 ns/op
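The results are also a reminder not to trust benchmarks blindly. Every library has things it is good at and things it is not, and whoever publishes a benchmark is biased, intentionally or not. You still need to load test against your own business scenario and run your own benchmarks before drawing conclusions.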
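A minimal sketch of running such a comparison yourself, using only the standard library so it is runnable anywhere. The `Target` type and both payloads are made up for illustration and are much smaller than the fixtures above; to compare libraries, the body of `decode` is where another decoder would be swapped in.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// Target binds only a small part of the payload (hypothetical type).
type Target struct {
	Person struct {
		ID string `json:"id"`
	} `json:"person"`
}

// fullPayload carries fields that Target never binds; trimmedPayload does not.
var fullPayload = []byte(`{"person":{"id":"d50887ca","geo":{"city":"Saint Petersburg","lat":59.93,"lng":30.33},"twitter":{"handle":"flickfaver","followers":2}},"company":null}`)
var trimmedPayload = []byte(`{"person":{"id":"d50887ca"},"company":null}`)

// decode is the operation under test.
func decode(data []byte) (Target, error) {
	var t Target
	err := json.Unmarshal(data, &t)
	return t, err
}

// benchDecode times decode on one payload using testing.Benchmark,
// which can be called outside of "go test".
func benchDecode(data []byte) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			if _, err := decode(data); err != nil {
				b.Fatal(err)
			}
		}
	})
}

func main() {
	fmt.Println("full:   ", benchDecode(fullPayload))
	fmt.Println("trimmed:", benchDecode(trimmedPayload))
}
```

Running the same decoder on both shapes of input shows how much of a headline number comes from the payload rather than from the library.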
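That said, even though the gap narrowed a lot after changing the benchmark (decode went from roughly 8x to roughly 4x faster than std), the improvement over the official library is still substantial. Next, let's look at how to use it.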
package main

import (
	"fmt"

	jsoniter "github.com/json-iterator/go"
)

type ColorGroup struct {
	ID     int
	Name   string
	Colors []string
}

func main() {
	// Same Marshal/Unmarshal API as the official encoding/json package.
	var json = jsoniter.ConfigCompatibleWithStandardLibrary

	group := ColorGroup{
		ID:     1,
		Name:   "Reds",
		Colors: []string{"Crimson", "Red", "Ruby", "Maroon"},
	}
	b, err := json.Marshal(group)
	fmt.Println(string(b), err)

	// Get reads a single value by path without binding the whole document.
	val := []byte(`{"ID":1,"Name":"Reds","Colors":["Crimson","Red","Ruby","Maroon"]}`)
	fmt.Println(jsoniter.Get(val, "Colors", 0).ToString())

	// Unmarshal first, then print, so the output does not depend on
	// the order in which Println's arguments are evaluated.
	data := ColorGroup{}
	err = json.Unmarshal(b, &data)
	fmt.Println(data, err)
}
By default its API matches the official one: after initializing var json = jsoniter.ConfigCompatibleWithStandardLibrary, you can use it just like the official API. We will analyze its source code in a later post.