% cd etcd
% ./scripts/
(cd etcdctl && env GO_BUILD_FLAGS= CGO_ENABLED=0 GO_BUILD_FLAGS= GOOS=darwin GOARCH=amd64 go build -trimpath -installsuffix=cgo -o=../bin/etcdctl .)
SUCCESS: etcd_build (GOARCH=amd64)
% ./bin/etcd --version
etcd Version: 3.6.0-alpha.0
Git SHA: 8da2a5b
Go Version: go1.19
Go OS/Arch: darwin/amd64
% export PATH="$PATH:`pwd`/bin"
% etcd
{"level":"warn","ts":"2023-06-07T09:17:23.681691+0800","caller":"embed/config.go:708","msg":"Running http and grpc server on single port. This is not recommended for production."}
% etcdctl put greeting "Hello, etcd"
% etcdctl get greeting
Hello, etcd
- Services important for dealing with etcd’s key space include:
  - KV - Creates, updates, fetches, and deletes key-value pairs.
  - Watch - Monitors changes to keys.
  - Lease - Primitives for consuming client keep-alive messages.
- Services which manage the cluster itself include:
  - Auth - Role-based authentication mechanism for authenticating users.
  - Cluster - Provides membership information and configuration facilities.
  - Maintenance - Takes recovery snapshots, defragments the store, and returns per-member status information.
service KV {
  Range(RangeRequest) returns (RangeResponse)
}
All responses from the etcd API have an attached response header which includes cluster metadata for the response. Its fields are as follows:
message ResponseHeader {
  uint64 cluster_id = 1;
  uint64 member_id = 2;
  int64 revision = 3;
  uint64 raft_term = 4;
}
message KeyValue {
  bytes key = 1;
  int64 create_revision = 2;
  int64 mod_revision = 3;
  int64 version = 4;
  bytes value = 5;
  int64 lease = 6;
}
A distributed lock built on etcd uses the create revision to decide who owns the lock, while the mod revision is used in MVCC scenarios to detect conflicting versions and implement compare-and-swap (CAS) logic. etcd maintains a 64-bit cluster-wide counter, the store revision, that is incremented each time the key space is modified. The revision serves as a global logical clock, sequentially ordering all updates to the store. The change represented by a new revision is incremental; the data associated with a revision is the data that changed the store. Internally, a new revision means writing the changes to the backend’s B+tree, keyed by the incremented revision.
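To make the relationship between the store revision and the per-key fields of `KeyValue` concrete, here is a minimal in-memory sketch. `ToyStore` is a hypothetical name and this is not etcd's implementation; it only mimics the bookkeeping described above, and the starting revision of 1 is an assumption made for the toy.

```python
from dataclasses import dataclass


@dataclass
class KeyValue:
    key: bytes
    create_revision: int  # store revision at which the key was created
    mod_revision: int     # store revision of the key's latest modification
    version: int          # number of modifications since the key was created
    value: bytes


class ToyStore:
    """Toy model of revision bookkeeping; illustrative only, not etcd code."""

    def __init__(self):
        self.revision = 1  # assumed starting value for this sketch
        self.kvs = {}

    def put(self, key: bytes, value: bytes) -> KeyValue:
        self.revision += 1  # every modification bumps the store-wide counter
        old = self.kvs.get(key)
        kv = KeyValue(
            key=key,
            create_revision=old.create_revision if old else self.revision,
            mod_revision=self.revision,
            version=old.version + 1 if old else 1,
            value=value,
        )
        self.kvs[key] = kv
        return kv


store = ToyStore()
a1 = store.put(b"a", b"1")  # first write of "a"
b1 = store.put(b"b", b"1")  # a different key still advances the counter
a2 = store.put(b"a", b"2")  # second write of "a"
print(a2)
```

After these three writes, `a2.create_revision` still points at the first write of `a`, while `a2.mod_revision` equals the current store revision. That difference is what a lock (ordered by create revision) and a CAS check (compare on mod revision) rely on.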
message RangeRequest {
  enum SortOrder {
    NONE = 0; // default, no sorting
    ASCEND = 1; // lowest target value first
    DESCEND = 2; // highest target value first
  }
  enum SortTarget {
    KEY = 0;
    VERSION = 1;
    CREATE = 2;
    MOD = 3;
    VALUE = 4;
  }
  bytes key = 1;
  bytes range_end = 2;
  int64 limit = 3;
  int64 revision = 4;
  SortOrder sort_order = 5;
  SortTarget sort_target = 6;
  bool serializable = 7;
  bool keys_only = 8;
  bool count_only = 9;
  int64 min_mod_revision = 10;
  int64 max_mod_revision = 11;
  int64 min_create_revision = 12;
  int64 max_create_revision = 13;
}
message RangeResponse {
  ResponseHeader header = 1;
  repeated mvccpb.KeyValue kvs = 2;
  bool more = 3;
  int64 count = 4;
}
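The `key`, `range_end`, and `limit` fields of `RangeRequest` and the `more` and `count` fields of `RangeResponse` can be sketched with a plain dictionary. This is an illustrative model only; `toy_range` and `prefix_range_end` are hypothetical helper names. It assumes the half-open interval `[key, range_end)`, an empty `range_end` meaning a single-key lookup, and `range_end` of `\0` meaning "all keys greater than or equal to `key`".

```python
def prefix_range_end(prefix: bytes) -> bytes:
    """Smallest key that sorts after every key starting with `prefix`."""
    end = bytearray(prefix)
    for i in reversed(range(len(end))):
        if end[i] < 0xFF:
            end[i] += 1               # bump the last byte that can be bumped
            return bytes(end[: i + 1])
    return b"\x00"                    # no upper bound exists for this prefix


def toy_range(kvs: dict, key: bytes, range_end: bytes = b"", limit: int = 0):
    """Return (pairs, more, count) for a range query over a dict of bytes keys."""
    if not range_end:
        matched = [k for k in sorted(kvs) if k == key]       # single key
    elif range_end == b"\x00":
        matched = [k for k in sorted(kvs) if k >= key]       # open-ended
    else:
        matched = [k for k in sorted(kvs) if key <= k < range_end]
    count = len(matched)                                     # keys in range
    page = matched[:limit] if limit > 0 else matched         # limit 0 = no limit
    more = len(page) < count                                 # keys left out?
    return [(k, kvs[k]) for k in page], more, count


data = {b"bar": b"0", b"foo": b"1", b"foo1": b"2", b"foo2": b"3", b"fop": b"4"}
pairs, more, count = toy_range(data, b"foo", prefix_range_end(b"foo"), limit=2)
print(pairs, more, count)
```

Here the prefix `foo` turns into the range `[foo, fop)`, so a prefix query is an ordinary range query with a computed `range_end`.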
message PutRequest {
  bytes key = 1;
  bytes value = 2;
  int64 lease = 3;
  bool prev_kv = 4;
  bool ignore_value = 5;
  bool ignore_lease = 6;
}
message PutResponse {
  ResponseHeader header = 1;
  mvccpb.KeyValue prev_kv = 2;
}
etcd models a transaction as an atomic If/Then/Else construct over the key-value store. Transactions can be used for protecting keys from unintended concurrent updates, building compare-and-swap operations, and developing higher-level concurrency control. All comparisons are applied atomically; if all comparisons are true, the transaction is said to succeed and etcd applies the transaction’s then / success request block, otherwise it is said to fail and applies the else / failure request block.
message Compare {
  enum CompareResult {
    EQUAL = 0;
    GREATER = 1;
    LESS = 2;
    NOT_EQUAL = 3;
  }
  enum CompareTarget {
    VERSION = 0;
    CREATE = 1;
    MOD = 2;
    VALUE = 3;
  }
  CompareResult result = 1;
  // target is the key-value field to inspect for the comparison.
  CompareTarget target = 2;
  // key is the subject key for the comparison operation.
  bytes key = 3;
  oneof target_union {
    int64 version = 4;
    int64 create_revision = 5;
    int64 mod_revision = 6;
    bytes value = 7;
  }
}
message RequestOp {
  // request is a union of request types accepted by a transaction.
  oneof request {
    RangeRequest request_range = 1;
    PutRequest request_put = 2;
    DeleteRangeRequest request_delete_range = 3;
  }
}
All together, a transaction is issued with a Txn API call, which takes a TxnRequest:
message TxnRequest {
  repeated Compare compare = 1;
  repeated RequestOp success = 2;
  repeated RequestOp failure = 3;
}
message TxnResponse {
  ResponseHeader header = 1;
  bool succeeded = 2;
  repeated ResponseOp responses = 3;
}
message ResponseOp {
  oneof response {
    RangeResponse response_range = 1;
    PutResponse response_put = 2;
    DeleteRangeResponse response_delete_range = 3;
  }
}
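The If/Then/Else shape of `TxnRequest` (compare, success, failure) can be sketched as a toy in-memory store. `ToyKV` and its tuple-based request format are hypothetical and chosen only for illustration. The sketch supports just put operations and the numeric compare targets, and it assumes a missing key compares as 0 for its revisions and version, which is what makes "create only if absent" expressible.

```python
import operator

# CompareResult names mapped to Python comparison functions.
RESULTS = {
    "EQUAL": operator.eq,
    "GREATER": operator.gt,
    "LESS": operator.lt,
    "NOT_EQUAL": operator.ne,
}


class ToyKV:
    """Toy stand-in for the key-value store; not the etcd implementation."""

    def __init__(self):
        self.revision = 1  # assumed starting value for this sketch
        self.data = {}     # key -> {value, create_revision, mod_revision, version}

    def _put(self, key, value):
        old = self.data.get(key)
        self.data[key] = {
            "value": value,
            "create_revision": old["create_revision"] if old else self.revision,
            "mod_revision": self.revision,
            "version": old["version"] + 1 if old else 1,
        }

    def txn(self, compares, success, failure):
        """compares: (key, target, result, expected); ops: ("put", key, value)."""
        def holds(cmp):
            key, target, result, expected = cmp
            actual = self.data.get(key, {}).get(target, 0)  # missing key -> 0
            return RESULTS[result](actual, expected)

        succeeded = all(holds(c) for c in compares)  # the "If" part
        ops = success if succeeded else failure      # "Then" or "Else"
        if ops:
            self.revision += 1  # all writes of one txn share a single revision
            for _, key, value in ops:
                self._put(key, value)
        return succeeded


kv = ToyKV()
absent = [("lock", "create_revision", "EQUAL", 0)]
first = kv.txn(absent, [("put", "lock", "owner-a")], [])   # key absent: succeeds
second = kv.txn(absent, [("put", "lock", "owner-b")], [])  # key exists: fails
seen = kv.data["lock"]["mod_revision"]
swapped = kv.txn([("lock", "mod_revision", "EQUAL", seen)],
                 [("put", "lock", "owner-c")], [])         # compare-and-swap
print(first, second, swapped, kv.data["lock"])
```

The second transaction loses because the key already has a non-zero create revision, which is the same check a lock built on etcd can use to decide ownership.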
message Event {
  enum EventType {
    PUT = 0;
    DELETE = 1;
  }
  EventType type = 1;
  KeyValue kv = 2;
  KeyValue prev_kv = 3;
}
Watches are long-running requests and use gRPC streams to stream event data. A single watch stream can multiplex many distinct watches by tagging events with per-watch identifiers.
Watches make three guarantees about events: ordering, reliability, and atomicity.
- Ordered - events are ordered by revision; an event will never appear on a watch if it precedes an event in time that has already been posted.
- Reliable - a sequence of events will never drop any subsequence of events; if there are events ordered in time as a < b < c, then if the watch receives events a and c, it is guaranteed to receive b.
- Atomic - a list of events is guaranteed to encompass complete revisions; updates in the same revision over multiple keys will not be split over several lists of events.
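The three guarantees above can be illustrated with a toy event log. `ToyWatchLog` is a hypothetical name, and the model is deliberately simplified: it replays a stored history instead of streaming, uses string keys, and emits exactly one list per revision. The source only guarantees that a revision is never split across lists, so one list per revision is an assumption of the sketch.

```python
class ToyWatchLog:
    """Toy event history; illustrative only, not how etcd stores events."""

    def __init__(self):
        self.revision = 1  # assumed starting value for this sketch
        self.events = []   # (revision, type, key, value) in revision order

    def commit(self, puts: dict) -> int:
        """Apply several puts under one new revision, like a single txn."""
        self.revision += 1
        for key, value in puts.items():
            self.events.append((self.revision, "PUT", key, value))
        return self.revision

    def watch(self, key: str, range_end: str, start_revision: int):
        """Yield one list of events per revision for keys in [key, range_end)."""
        batch, current = [], None
        for rev, kind, k, v in self.events:   # ordered: history is replayed as is
            if rev < start_revision or not (key <= k < range_end):
                continue
            if current is not None and rev != current:
                yield batch                   # atomic: close a revision as a whole
                batch = []
            current = rev
            batch.append((rev, kind, k, v))   # reliable: no matching event skipped
        if batch:
            yield batch


log = ToyWatchLog()
log.commit({"a": "1"})            # revision 2
log.commit({"a": "2", "b": "1"})  # revision 3 touches two keys at once
log.commit({"z": "9"})            # revision 4, outside the watched range
batches = list(log.watch("a", "c", start_revision=2))
print(batches)
```

The two events of revision 3 arrive together in one list, and the event on `z` never appears because it falls outside the watched range.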
message WatchCreateRequest {
  bytes key = 1;
  bytes range_end = 2;
  int64 start_revision = 3;
  bool progress_notify = 4;
  enum FilterType {
    NOPUT = 0;
    NODELETE = 1;
  }
  repeated FilterType filters = 5;
  bool prev_kv = 6;
}
Leases are a mechanism for detecting client liveness: once heartbeats stop arriving, the client is considered dead. The cluster grants leases with a time-to-live. A lease expires if the etcd cluster does not receive a keepAlive within a given TTL period.
message LeaseGrantRequest {
  int64 TTL = 1;
  int64 ID = 2;
}
message LeaseRevokeRequest {
  int64 ID = 1;
}
Leases are refreshed using a bi-directional stream created with the LeaseKeepAlive API call.
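The grant / keep-alive / revoke cycle can be sketched with a toy lessor driven by a manually advanced clock. `ToyLessor` and its `advance` method are hypothetical; a real client would instead use the LeaseGrant, LeaseKeepAlive, and LeaseRevoke calls. The sketch assumes that an `ID` of 0 lets the lessor pick an ID and that keys attached to a lease are deleted when the lease expires or is revoked.

```python
class ToyLessor:
    """Toy lease bookkeeping with a fake clock; not the etcd implementation."""

    def __init__(self):
        self.now = 0        # fake clock in seconds, moved by advance()
        self.leases = {}    # lease ID -> (ttl, deadline)
        self.attached = {}  # lease ID -> set of keys bound to the lease
        self.kvs = {}
        self._next_id = 1

    def grant(self, ttl: int, lease_id: int = 0) -> int:
        if lease_id == 0:   # 0 asks the lessor to choose an unused ID
            while self._next_id in self.leases:
                self._next_id += 1
            lease_id = self._next_id
        self.leases[lease_id] = (ttl, self.now + ttl)
        self.attached[lease_id] = set()
        return lease_id

    def put(self, key, value, lease: int = 0):
        self.kvs[key] = value
        if lease:
            self.attached[lease].add(key)

    def keep_alive(self, lease_id: int):
        ttl, _ = self.leases[lease_id]
        self.leases[lease_id] = (ttl, self.now + ttl)  # push the deadline out

    def revoke(self, lease_id: int):
        for key in self.attached.pop(lease_id, set()):
            self.kvs.pop(key, None)                    # attached keys go away
        self.leases.pop(lease_id, None)

    def advance(self, seconds: int):
        self.now += seconds
        for lease_id, (_, deadline) in list(self.leases.items()):
            if deadline <= self.now:                   # no keep-alive in time
                self.revoke(lease_id)


lessor = ToyLessor()
lid = lessor.grant(ttl=5)
lessor.put("svc/node1", "up", lease=lid)
lessor.advance(3)
lessor.keep_alive(lid)                   # deadline moves from 5 to 8
lessor.advance(4)                        # now = 7, lease still alive
alive_at_7 = "svc/node1" in lessor.kvs
lessor.advance(5)                        # now = 12, lease expired
alive_at_12 = "svc/node1" in lessor.kvs
print(alive_at_7, alive_at_12)
```

This is the pattern behind service registration: a process keeps its key alive by refreshing the lease, and the key disappears on its own once the process stops sending keep-alives.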