Re: Jepsen result for etcd 3.4.3

Brandon Philips <brandon.philips@...>

On Fri, Jan 31, 2020 at 1:35 AM Alexis Richardson <alexis@...> wrote:
On Fri, Jan 31, 2020 at 1:00 AM Brandon Philips
<brandon.philips@...> wrote:
>
> On Thu, Jan 30, 2020 at 7:40 AM Alexis Richardson <alexis@...> wrote:
>>
>> One side question from me -- I think it would be good to understand
>> more about recommended etcd set-ups at different scales of k8s cluster
>> (10, 50, 150, 500+ nodes) and how to deal with n/w partitions.
>
>
> There are many variables on how much workload an API server puts on etcd but there is a ballpark guess doc here:
>
> https://etcd.io/docs/v3.4.0/op-guide/hardware/

This is great; I don't know how I missed it before. QQ: Are your
example configs each using only one VM per cluster? That is not
totally clear from the docs.

The example configs describe the etcd nodes needed to support a Kubernetes cluster of a given size. Because etcd doesn't scale horizontally, the guide, I believe, covers recommended cluster sizes of up to 5 nodes. Happy to take a PR if something needs clarifying, though.
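The reason clusters top out around 5 members is Raft quorum arithmetic: every write must be acknowledged by a majority, so each extra pair of members adds replication cost while buying only one more tolerated failure. A minimal sketch of the arithmetic (illustrative only, not from the hardware guide):

    package main

    import "fmt"

    func main() {
        // A Raft cluster of n members commits a write once a majority
        // (the quorum) acknowledges it; the cluster keeps working as
        // long as no more than n - quorum members are lost.
        for _, n := range []int{1, 3, 5, 7} {
            quorum := n/2 + 1
            fmt.Printf("members=%d quorum=%d tolerated failures=%d\n",
                n, quorum, n-quorum)
        }
    }

Going from 5 to 7 members tolerates only one additional failure while every write fans out to two more replicas, which is why larger clusters are rarely worth it.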
 
> What do you mean by dealing with network partitions? Does the failure guide section on network partitions help? https://etcd.io/docs/v3.4.0/op-guide/failures/

Thank you, I've seen this. I guess what I am asking for is
practical guidance: how often to expect different types of network
failure, what to expect when one happens, how long it might last, and
so on. I understand this is hard to do in a general manner.

That all depends on the network, not on etcd. All etcd can guarantee is that it will tolerate a network partition of arbitrary length and will only allow writes on the side of the partition where a majority of members can still talk to each other. Once the partition recovers, health checks on the minority side pass again as those members rejoin and catch up with the majority.
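To make the majority/minority behavior concrete, here is a rough sketch using the official Go client (go.etcd.io/etcd/client/v3); the endpoint URLs and key are hypothetical placeholders. A write routed to a minority-side member cannot commit and fails at its context deadline, while the same write against the majority side succeeds:

    package main

    import (
        "context"
        "fmt"
        "time"

        clientv3 "go.etcd.io/etcd/client/v3"
    )

    // probe attempts a write through a single etcd endpoint and reports
    // whether it committed before the deadline. During a partition,
    // writes routed to the minority side cannot reach quorum and time out.
    func probe(endpoint string) {
        cli, err := clientv3.New(clientv3.Config{
            Endpoints:   []string{endpoint}, // placeholder endpoint
            DialTimeout: 2 * time.Second,
        })
        if err != nil {
            fmt.Printf("%s: dial failed: %v\n", endpoint, err)
            return
        }
        defer cli.Close()

        ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
        defer cancel()
        if _, err := cli.Put(ctx, "probe-key", "probe-value"); err != nil {
            fmt.Printf("%s: write failed (likely minority side): %v\n", endpoint, err)
            return
        }
        fmt.Printf("%s: write committed (majority side)\n", endpoint)
    }

    func main() {
        for _, ep := range []string{"http://10.0.0.1:2379", "http://10.0.0.2:2379"} {
            probe(ep)
        }
    }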

As far as tuning for a particular network goes, see the tuning guide: https://etcd.io/docs/v3.4.0/tuning/
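The main time parameters there are the heartbeat interval and the election timeout (the --heartbeat-interval and --election-timeout flags), which the guide says to set relative to the round-trip time between members. As an illustration, the same knobs can be set through etcd's embed package; the values below are etcd's defaults, not a recommendation for any particular network:

    package main

    import (
        "log"

        "go.etcd.io/etcd/server/v3/embed"
    )

    func main() {
        cfg := embed.NewConfig()
        // Defaults shown for illustration: the tuning guide suggests a
        // heartbeat interval around the RTT between members and an
        // election timeout of several multiples of that.
        cfg.TickMs = 100      // heartbeat interval in milliseconds
        cfg.ElectionMs = 1000 // election timeout in milliseconds

        e, err := embed.StartEtcd(cfg)
        if err != nil {
            log.Fatal(err)
        }
        defer e.Close()

        <-e.Server.ReadyNotify() // block until the server is up
        log.Println("etcd is ready")
        <-e.Err() // run until a fatal error occurs
    }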

Thank You,

Brandon
