Operator Definition


Marco Voelz
 

Hey everyone,

 

not sure if you all have seen this, but I really liked this definition which contrasts operators with controllers: https://twitter.com/caffeinepresent/status/1196243905974390784?s=20

 

As Gerred is also in this thread, I'd imagine this is nothing new, just wanted to make sure it is seen/heard.

 

Warm regards

Marco

 

From: <cncf-sig-app-delivery@...> on behalf of Troy Topnik <troy.topnik@...>
Date: Monday, 18. November 2019 at 19:28
To: "cncf-sig-app-delivery@..." <cncf-sig-app-delivery@...>
Subject: Re: [cncf-sig-app-delivery] Operator Definition

 

On Fri, Nov 8, 2019 at 04:08 PM, Tobias Knaup wrote:

"A Kubernetes operator is a controller that manages routine procedures and lifecycle operations of workloads."

I like how concise this is!

I'm actually OK with the original definition Matt proposed. If your application/workload is neither stateful nor complex, why use an operator?

TT


Reitbauer, Alois
 

We will have an agenda item for this on the next call. Given there is so much discussion, I assume this will take some time. However, finishing this in the next two months sounds reasonable.

 

I am in favor of assessing what is already used. This should not limit us in what we propose to be future use.

 

// Alois

 

From: <cncf-sig-app-delivery@...> on behalf of Matt Farina <matt@...>
Date: Monday, 18. November 2019 at 06:21
To: Alois Reitbauer <alois.reitbauer@...>, Noel OConnor <noel.oconnor@...>, Erick Carty <erickcarty@...>
Cc: Chris Short <chris@...>, "Li, Xiang" <x.li@...>, "cncf-sig-app-delivery@..." <cncf-sig-app-delivery@...>
Subject: Re: [cncf-sig-app-delivery] Operator Definition

 

The more I look at this the more I wonder if we need two things. A simple definition and a long form description. They will serve different audiences with different levels of detail.

 

I think we need to start from the beginning and not jump to definition too early.

 

  • First we need to define which problems we want operators to solve.
  • Then we should define what they do to solve the problem.
  • The final step should then be to define how they do it.
  • Bonus: Describe how this can be implemented on other platforms.

 

Operators have been in the wild for quite some time. We aren't solving for something new. I would hesitate to treat this as a place where we define problems and solutions. Instead, I would try to describe what is already happening. That may change over time, and it already has: the original definition talked about stateful apps, and we now deal with stateless workloads as well.

 

The TOC kicked a task out to SIG App Delivery. We should try to complete it and get the results back to them. The SIGs are there to help the TOC. I would hope we have something completed by the end of December.

 

Can we pick this up in earnest next week, after the conference going on this week is out of the way?

 

- Matt Farina

 

On Mon, Nov 18, 2019, at 8:41 AM, Reitbauer, Alois wrote:

I think we need to start from the beginning and not jump to definition too early.

 

  • First we need to define which problems we want operators to solve.
  • Then we should define what they do to solve the problem.
  • The final step should then be to define how they do it.
  • Bonus: Describe how this can be implemented on other platforms.

 

Here is my initial take based on the discussion in this thread:

 

Which problem do we solve:

 

Platforms like Kubernetes offer (low-level) mechanisms to install, run and modify workloads running on them. Defining these workloads requires detailed knowledge of the inner workings of the application, that is, the complex workload to be installed. Besides being complex, this definition is also platform-specific. Operators replace these low-level concepts with domain-specific concepts and can perform these actions based on a domain-specific description. This means somebody can perform tasks like

 

  • Install a three node Redis cluster
  • Upgrade cluster from version X to version Y
  • Uninstall cluster
  • Upgrade storage for cluster

 

without a detailed understanding of which actions need to be triggered on the underlying runtime.

 

Beyond these basic operations, more complex tasks can be automated to keep a system healthy. While the above tasks are usually triggered by a user, these more advanced tasks are handled automatically in the background.

 

What does an operator do:

 

This abstraction is made possible by a domain-specific configuration file that is passed to the application runtime. The operator then translates these concepts into actions that the underlying platform understands. It does this by creating a platform configuration, also referred to as the desired state, and then triggering actions until the system is configured as desired. The operator then constantly checks the actual configuration and makes the required adjustments, for example if health checks fail.
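As a rough illustration of the "desired state, then trigger actions until configured as desired" idea above, here is a minimal sketch. The `ClusterState` type, the action names and the in-memory "platform" are all assumptions made for the example; a real operator would call the platform's API instead of mutating a local object.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterState:
    """Hypothetical, simplified state of a managed workload."""
    nodes: int
    version: str


def diff(desired: ClusterState, actual: ClusterState) -> List[str]:
    """Return the actions needed to move `actual` towards `desired`."""
    actions = []
    if actual.nodes < desired.nodes:
        actions.append("add_node")
    elif actual.nodes > desired.nodes:
        actions.append("remove_node")
    if actual.version != desired.version:
        actions.append("upgrade")
    return actions


def act(action: str, desired: ClusterState, actual: ClusterState) -> None:
    """Apply one action. Stand-in for a call to the underlying platform."""
    if action == "add_node":
        actual.nodes += 1
    elif action == "remove_node":
        actual.nodes -= 1
    elif action == "upgrade":
        actual.version = desired.version


def reconcile(desired: ClusterState, actual: ClusterState, max_rounds: int = 10) -> int:
    """Repeat observe -> diff -> act until no actions remain.

    Returns the number of rounds in which actions were performed.
    """
    rounds = 0
    for _ in range(max_rounds):
        actions = diff(desired, actual)
        if not actions:
            break
        for action in actions:
            act(action, desired, actual)
        rounds += 1
    return rounds


desired = ClusterState(nodes=3, version="5.0")
actual = ClusterState(nodes=1, version="4.0")
rounds = reconcile(desired, actual)
print(rounds, actual)
```

The loop is bounded by `max_rounds` only to keep the sketch finite; the text above describes a check that keeps running for the lifetime of the workload.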

 

How is this done in Kubernetes:

 

Operators in Kubernetes are extensions to the Kubernetes API. They are usually stateful and configured using custom resource definitions (CRDs). An operator uses a reconciliation loop to query the health and state of all platform workloads it is running and compares them to the desired state derived from the domain-specific configuration. If any changes are necessary, the operator controller triggers them automatically. As the operator also provides an extension to the Kubernetes API, end-users (operators) can also trigger domain-specific configuration changes, which are then again translated into platform-specific changes.

 

This hides the complexity of configuring deployments, persistent volume claims, etc. for applications from end-users.  Good implementations will also change their behavior based on specific environment characteristics – e.g. is a service mesh present or not – without requiring specific configuration from the end-user.
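To make the "hides the complexity" point concrete, here is a small sketch of how a short domain-specific description might be expanded into several lower-level objects. The field names, the `example.com/mesh-sidecar` annotation and the dictionary shapes are assumptions for illustration; they only loosely resemble Kubernetes manifests and are not complete or valid ones.

```python
from typing import Any, Dict, List


def render_platform_resources(
    spec: Dict[str, Any], service_mesh_present: bool = False
) -> List[Dict[str, Any]]:
    """Expand a domain-specific description into low-level platform objects."""
    name = spec["name"]
    resources: List[Dict[str, Any]] = []
    # One storage claim per node of the cluster.
    for index in range(spec["nodes"]):
        resources.append({
            "kind": "PersistentVolumeClaim",
            "name": f"{name}-data-{index}",
            "storage": spec["storage"],
        })
    deployment: Dict[str, Any] = {
        "kind": "Deployment",
        "name": name,
        "replicas": spec["nodes"],
        "image": f"{spec['image']}:{spec['version']}",
        "annotations": {},
    }
    # Adapt to the environment without asking the end-user to configure it.
    if service_mesh_present:
        deployment["annotations"]["example.com/mesh-sidecar"] = "enabled"
    resources.append(deployment)
    return resources


# What the end-user writes: "a three node Redis cluster".
redis_cluster = {"name": "redis", "image": "redis", "nodes": 3,
                 "version": "5.0", "storage": "10Gi"}

for resource in render_platform_resources(redis_cluster, service_mesh_present=True):
    print(resource["kind"], resource["name"])
```

The end-user only touches `redis_cluster`; everything else is derived, which is the abstraction the paragraph above describes.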

 

An end-user in this definition is everybody who did not build the software that is deployed.

 

Bonus: How can this be done on other platforms:

 

In an AWS environment an Operator can be defined using an S3 Bucket for the configuration and a Lambda function which triggers API commands based on configuration changes of the S3 bucket.
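A possible shape for the S3-plus-Lambda idea, as a sketch only. The `Records[].s3.bucket.name` and `Records[].s3.object.key` fields follow the S3 event notification format; everything else (`make_handler`, `load_config`, `observe`, `apply`, the JSON layout of the configuration) is hypothetical and injected so the example runs without AWS access. In a real function, `load_config` would read the object from S3 and `apply` would issue the API commands.

```python
import json
from typing import Any, Callable, Dict, List, Optional, Tuple
from urllib.parse import unquote_plus


def changed_objects(event: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Extract (bucket, key) pairs from an S3 event notification."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications.
        pairs.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return pairs


def make_handler(
    load_config: Callable[[str, str], str],
    observe: Callable[[str], Optional[Dict[str, Any]]],
    apply: Callable[[Dict[str, Any]], None],
) -> Callable[[Dict[str, Any], Any], Dict[str, Any]]:
    """Build a Lambda-style handler from hypothetical platform callbacks."""

    def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
        applied = []
        for bucket, key in changed_objects(event):
            desired = json.loads(load_config(bucket, key))
            actual = observe(desired["name"])
            if actual != desired:
                apply(desired)  # stand-in for the platform API commands
                applied.append(desired["name"])
        return {"reconciled": applied}

    return handler


# In-memory stand-ins for the bucket and the managed platform.
store = {("operator-config", "clusters/redis.json"):
         json.dumps({"name": "redis", "nodes": 3})}
state: Dict[str, Dict[str, Any]] = {}
handler = make_handler(
    load_config=lambda bucket, key: store[(bucket, key)],
    observe=lambda name: state.get(name),
    apply=lambda desired: state.__setitem__(desired["name"], desired),
)
event = {"Records": [{"s3": {"bucket": {"name": "operator-config"},
                             "object": {"key": "clusters/redis.json"}}}]}
print(handler(event, None))
```

Note that this only reacts to configuration changes, as described above. Detecting drift in the running system would need an additional periodic trigger, which is not shown here.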

 

 

 

 

 

From: <cncf-sig-app-delivery@...> on behalf of Noel OConnor <noel.oconnor@...>
Date: Monday, 11. November 2019 at 18:58
To: Erick Carty <erickcarty@...>
Cc: "chris@..." <chris@...>, "Li, Xiang" <x.li@...>, "cncf-sig-app-delivery@..." <cncf-sig-app-delivery@...>
Subject: Re: [cncf-sig-app-delivery] Operator Definition

 

"An Operator is an application-specific controller that extends the..."

 

Do we run the risk of confusing the concept of "controller" here? K8s has controllers, and Operators can have a collection of related controllers, so I think we need to come up with a better term.

 

On Mon, Nov 11, 2019 at 9:52 AM Erick Carty <erickcarty@...> wrote:

From this current discussion to answer "What is an operator?", the following have been proposed with some edits; if there are others from SIG-Apps or SIG-Apps-Delivery, please feel free to add them...

  • An Operator is an application-specific controller that extends the Kubernetes API to create, configure, and manage instances of complex (stateful) (applications) workloads on behalf of a Kubernetes user. It builds upon the basic Kubernetes resource and controller concepts but includes domain or application-specific knowledge to automate common tasks.
  • An Operator is a Kubernetes API extension to automate domain-specific workflow actions through declarative definitions.
  • An operator is a platform extension that creates, configures, and manages the lifecycle of workloads.
    • A Kubernetes operator is an extension to the Kubernetes API that builds upon the basic Kubernetes resource and controller concepts but includes domain-specific knowledge to automate common tasks for a given workload.
  • An operator is software that manages routine procedures and lifecycle operations of workloads.
    • A Kubernetes operator is a controller that manages routine procedures and lifecycle operations of workloads.

I feel that, ultimately, we're all trying to describe the same concepts from different angles: a general (concise) operator definition that doesn't require K8s, and a framework specific to K8s that allows potential automation of workload operations. The OperatorHub provides a Capability Levels category that tries to associate the available Operators with their specific use case and set expectations. It seems that ultimately we'll want to use an "Auto Pilot" operator type when available, but there may be a need for customizations, so perhaps a co-op play with Helm, Terraform, Ansible, etc. The notion of "runbooks as code" seems to resonate with some. How would we describe the ability to self-heal closer/personalized to the workload, and (perhaps for a separate discussion) what implications does it have on storage, DR, network, etc.?

 

* Only 1 Operator today is classified as "Auto Pilot".

 

On Sun, Nov 10, 2019 at 8:06 AM Chris Short via Lists.Cncf.Io <chris=chrisshort.net@...> wrote:

Easy and simple/simply are words to be avoided in definitions, especially in the K8s space. :-)

 

Thank you so much to Matt for driving this discussion. Is there a working definition at this point that incorporates this discussion into Matt's original definition? I feel that recap will help drive this towards done.

 

Chris Short

He/Him/His

 

 

On Sat, Nov 9, 2019 at 10:28 PM Li, Xiang <x.li@...> wrote:

“Easy” is relative. At least Kubernetes makes it easier than some other environments to achieve the defined model. 

 

But yes, managing the complex app is the hard part. No one says k8s can make that part very easy. Managing an app is different from achieving the management model.

 

 

 

------------------------------------------------------------------

From: Gerred Dillon <hello@...>
Date: 2019-11-10 11:03:49
To: Xiang Li <x.li@...>
Cc: cncf-sig-app-delivery <cncf-sig-app-delivery@...>; Tobias Knaup <tobi@...>; Zhang, Lei <lei.zhang@...>
Subject: Re: [cncf-sig-app-delivery] Operator Definition

 

I disagree with the last paragraph of this reply - or the characterization in how it applies to existing operators. Strimzi is a CNCF example where these points are made true — and Kubernetes offers a model but doesn’t necessarily make it easy.

 

That said, the definitions are in line with my understanding of an operator. I want to contribute perspective indirectly with the intent to help this conversation...

 

The concept of providing state declaratively, regardless of implementation, is a key tenet of how we model systems and implement them in Kubernetes. So far this community has defined APIs in terms of Kubernetes-style object models or resources, which we have collectively and silently termed “declarative” for our use across projects. Maybe there’s something to be said for that, and for taking a cue from Cloud Events: determining what we expect an operator to be from the resource model and applying it to implementations?

 

That’s not to say we should suddenly start writing a CRD, but we should start creating a well-specified model and extract the definition from that.

 

Reading through it, this still sounds very generic. I’d like to start talking with existing operators and end users to understand their views of how this model could advance the state of services they feel “need” operators and expand it to our pending definition.

 

I’m happy to own this data collection with the SIG’s blessing and push it into the outstanding issue/PR.

 

 

 

 

On Nov 9, 2019, at 9:27 PM, Li, Xiang <x.li@...> wrote:

While we could extend the Operator concept developed at CoreOS to non-Kubernetes platforms, there are a few implicit key points that we must preserve. Otherwise we are defining something else.

 

1. Operator keeps the desired state of the managed workload. The API of an operator is declarative not imperative. 

 

2. Operator is reactive and mostly automatic. It follows the observe, diff, act pattern from Kubernetes generic controllers.

 

3. Operator embeds the domain knowledge for a specific workload operation into code. This makes an operator different from a controller. 
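One way to picture how the three points fit together is the sketch below. The "replicas before primary" rule is an assumption chosen purely to illustrate what embedded domain knowledge could look like; it is not taken from any particular operator, and `plan_upgrade` and the node layout are hypothetical.

```python
from typing import Dict, List


def plan_upgrade(nodes: Dict[str, Dict[str, str]], target_version: str) -> List[str]:
    """Return the order in which nodes should be upgraded.

    Point 1: the input is declarative (a target version, not a list of steps).
    Point 2: the function observes the current nodes and diffs them against it.
    Point 3: the ordering encodes workload-specific knowledge. Here the
             illustrative rule is "replicas first, primary last".
    """
    outdated = [name for name, info in nodes.items()
                if info["version"] != target_version]
    # False sorts before True, so non-primary nodes come first; ties by name.
    return sorted(outdated, key=lambda name: (nodes[name]["role"] == "primary", name))


observed = {
    "node-a": {"role": "primary", "version": "4.0"},
    "node-b": {"role": "replica", "version": "4.0"},
    "node-c": {"role": "replica", "version": "5.0"},
}
print(plan_upgrade(observed, "5.0"))
```

A generic controller could do the diff in point 2 on its own; the ordering rule is the part that only makes sense for a specific workload.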

 

The Kubernetes environment just makes all three easy. Yes, there are so-called x-operators in the Kube world, but they do not try to achieve the three key points. Most of them should be called installers or deployers instead, I guess...

 

 

 

------------------------------------------------------------------

From: alexis richardson <alexis@...>
Date: 2019-11-09 21:30:08
To: Tobias Knaup <tobi@...>
Cc: Zhang, Lei <lei.zhang@...>; <cncf-sig-app-delivery@...>
Subject: Re: [cncf-sig-app-delivery] Operator Definition

 

"An operator is software that manages routine procedures and lifecycle

operations of workloads."

 

Isn't it also the case that an operator *delegates some of this responsibility* to the orchestrator? Or, "leverages the orchestrator".

 

 

On Fri, Nov 8, 2019 at 10:42 PM Tobias Knaup <tobi@...> wrote:

> +1 for workload and for a concise definition like Devdatta proposed.

> I found that most people get it when I explain operators as "runbooks as code". Especially when I talk to folks that don't know advanced concept like CRDs yet.

> So based on what was said here earlier and the Wikipedia article on runbooks an operator could be defined as:

> "An operator is software that manages routine procedures and lifecycle operations of workloads."

> A Kubernetes operator could be defined with a bit more detail:

> "A Kubernetes operator is a controller that manages routine procedures and lifecycle operations of workloads."

> - Tobi

> On Fri, Nov 8, 2019 at 12:09 PM alexis richardson <alexis@...> wrote:

>> 

>> +1 for workload.  I think application will eventually have a definition we can all get behind, and it will be different.

>> 

>> 

>> On Fri, 8 Nov 2019, 19:20 Zhang, Lei, <lei.zhang@...> wrote:

>>> 

>>> After reading thru the thread, I tend to agree "workload" reads more accurate than "application" in Operator Framework's case.

>>> 

>>> The difference here seems to be: "how to run" vs "what to run".

>>> 

>>> ---

>>> Lei Zhang (Harry)

>>> 

>>> Alibaba Group

>>> 

>>> ------------------Original Mail ------------------

>>> Sender: <cncf-sig-app-delivery@...>

>>> Send Date:Fri Nov 8 10:41:00 2019

>>> Recipients: <cncf-sig-app-delivery@...>

>>> Subject:Re: [cncf-sig-app-delivery] Operator Definition

>>>> 

>>>> I agree with Matt and Doug that the scope of the definition need not be specific to Kubernetes. Given this SIG is at the CNCF level, aren't we responsible for defining these things broadly across the industry for cloud native in general? We could create a general definition to state what an operator is, and then offer definitions specific to platforms that define how it works on said platform, or leave that up to those projects to define for themselves.

>>>> 

>>>> For example, the general definition of operator could be something along the lines of Devdatta's proposal or the first sentence of Matt's proposal without mentioning Kubernetes specifics:

>>>> 

>>>> "An operator is a platform extension that creates, configures, and manages the lifecycle of workloads."

>>>> 

>>>> 

>>>> The Kubernetes operator definition defines how the pattern works specifically on Kubernetes:

>>>> 

>>>> "A Kubernetes operator is an extension to the Kubernetes API that builds upon the basic Kubernetes resource and controller concepts but includes domain-specific knowledge to automate common tasks for a given workload."

>>>> 

>>>> 

>>>> This has a few benefits:

>>>> 

>>>> The general definition can be applied to systems where the pattern already exists using that system's specific constructs and terminology.

>>>> It should be easy enough to understand the general concept without getting into platform-specific details. "I have a Redis operator for platform X" gets the point across without getting into the details of platform X.

>>>> Platform-specific definitions can follow the general pattern in a way that is familiar and easy to understand for the platform's users. "I have a Redis operator for Kubernetes" denotes how the operator works specifically in Kubernetes, so you know exactly what this means in the context of Kubernetes, but even if you're not familiar with Kubernetes, you at least have an idea of what it means from the general definition.

>>>> 

>>>> 

>>>> Side note: Quinton mentioned earlier that the use of "application" here may be somewhat misleading. I tend to agree with that and so I used "workload" as a more general term.

>>>> 

>> 

 

 

 

 

The contents of this e-mail are intended for the named addressee only. It contains information that may be confidential. Unless you are the named addressee or an authorized designee, you may not copy or use it, or disclose it to anyone else. If you received it in error please notify us immediately and then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) is a company registered in Linz whose registered office is at 4040 Linz, Austria, Freistädterstraße 313

 


 



Reitbauer, Alois
 

If your application/workload is neither stateful nor complex, why use an operator?

 

Because you want to hide the low-level details or because it needs domain specific configuration.

 

 

 

From: <cncf-sig-app-delivery@...> on behalf of Troy Topnik <troy.topnik@...>
Date: Monday, 18. November 2019 at 10:29
To: "cncf-sig-app-delivery@..." <cncf-sig-app-delivery@...>
Subject: Re: [cncf-sig-app-delivery] Operator Definition

 

On Fri, Nov 8, 2019 at 04:08 PM, Tobias Knaup wrote:

"A Kubernetes operator is a controller that manages routine procedures and lifecycle operations of workloads."

I like how concise this is!

I'm actually OK with the original definition Matt proposed. If your application/workload is neither stateful nor complex, why use an operator?

TT



Troy Topnik
 

On Fri, Nov 8, 2019 at 04:08 PM, Tobias Knaup wrote:
"A Kubernetes operator is a controller that manages routine procedures and lifecycle operations of workloads."
I like how concise this is!

I'm actually OK with the original definition Matt proposed. If your application/workload is neither stateful nor complex, why use an operator?

TT


Matt Farina
 

The more I look at this the more I wonder if we need two things. A simple definition and a long form description. They will serve different audiences with different levels of detail.

I think we need to start from the beginning and not jump to definition too early.

 

  • First we need to define which problems we want operators to solve.
  • Then we should define what they do to solve the problem.
  • The final step should then be to define how they do it.
  • Bonus: Describe how this can be implemented on other platforms.

Operators have been in the wild for quite some time. We aren't solving for something new. I would hesitate to treat this as a place where we define problems and solutions. Instead, I would try to describe what is already happening. That may change over time, and it already has: the original definition talked about stateful apps, and we now deal with stateless workloads as well.

The TOC kicked a task out to SIG App Delivery. We should try to complete it and get the results back to them. The SIGs are there to help the TOC. I would hope we have something completed by the end of December.

Can we pick this up in earnest next week, after the conference going on this week is out of the way?

- Matt Farina

>>>>
>>>>
>>>> Side note: Quinton mentioned earlier that the use of "application" here may be somewhat misleading. I tend to agree with that and so I used "workload" as a more general term.
>>>>
>> 


 


The contents of this e-mail are intended for the named addressee only. It contains information that may be confidential. Unless you are the named addressee or an authorized designee, you may not copy or use it, or disclose it to anyone else. If you received it in error please notify us immediately and then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) is a company registered in Linz whose registered office is at 4040 Linz, Austria, Freistädterstraße 313

Attachments:
  • image001.png
  • image002.png


Reitbauer, Alois
 

I think we need to start from the beginning and not jump to definition too early.

 

  • First we need to define which problems we want operators to solve.
  • Then we should define what they do to solve the problem.
  • The final step should then be to define how they do it.
  • Bonus: Describe how this can be implemented on other platforms.

 

Here is my initial take based on the discussion in this thread:

 

Which problem do we solve:

 

Platforms like Kubernetes offer (low-level) mechanisms to install, run and modify the workloads running on them. Defining these workloads requires detailed knowledge of the inner workings of the application, that is, the complex workload to be installed. Besides being complex, this definition is also platform-specific. Operators replace these low-level concepts with domain-specific concepts and can perform these actions based on a domain-specific description. This means somebody can perform tasks like

 

  • Install a three node Redis cluster
  • Upgrade cluster from version X to version Y
  • Uninstall cluster
  • Upgrade storage for cluster

 

All of this is possible without a detailed understanding of which actions need to be triggered on the underlying runtime.

 

Beyond these basic operations, more complex tasks can be automated to keep a system healthy. While the above tasks are usually triggered by a user, these more advanced tasks are handled automatically in the background.

 

What does an operator do:

 

This abstraction is made possible by a domain-specific configuration file that is passed to the application runtime. The operator then translates these concepts into actions that can be understood by the underlying platform. An operator does this by creating a platform configuration (also referred to as the desired state) and then triggering actions until the system is configured as desired. The operator then constantly checks the actual configuration and makes the required adjustments, for example if health checks fail.
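
To make the translation step concrete, here is a minimal sketch in Python of how a domain-specific description could be rendered into a platform-level desired state. Everything in it is an assumption for illustration: the `RedisClusterSpec` schema, the `render_desired_state` helper and the "workload"/"volume" object kinds are hypothetical and do not correspond to any real platform API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedisClusterSpec:
    """Domain-specific description written by the end-user (hypothetical schema)."""
    name: str
    nodes: int
    version: str
    storage_gb: int


def render_desired_state(spec: RedisClusterSpec) -> dict:
    """Translate the domain-specific spec into generic platform objects.

    The object kinds ("workload", "volume") are illustrative stand-ins for
    whatever the underlying platform offers; they are not a real API.
    """
    desired = {}
    for i in range(spec.nodes):
        node = f"{spec.name}-{i}"
        # One workload per node, each bound to its own volume.
        desired[f"workload/{node}"] = {
            "image": f"redis:{spec.version}",
            "volume": f"volume/{node}",
        }
        desired[f"volume/{node}"] = {"size_gb": spec.storage_gb}
    return desired


# "Install a three node Redis cluster" expressed as a domain-specific description.
spec = RedisClusterSpec(name="cache", nodes=3, version="5.0", storage_gb=10)
desired = render_desired_state(spec)
print(sorted(desired))
```

In this sketch an upgrade or a storage change is just a new spec with a different `version` or `storage_gb`; the end-user never touches the platform objects directly.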

 

How is this done in Kubernetes:

 

Operators in Kubernetes are extensions to the Kubernetes API. They are usually stateful and configured using custom resource definitions (CRDs). An operator uses a reconciliation loop to query the health and state of all platform workloads it is running and compares them to the desired state based on the domain-specific configuration. If any changes are necessary, the operator controller automatically triggers them. As the operator also provides an extension to the Kubernetes API, end-users (operators) can also trigger domain-specific configuration changes, which are then again translated into platform-specific changes.

 

This hides the complexity of configuring deployments, persistent volume claims, etc. for applications from end-users.  Good implementations will also change their behavior based on specific environment characteristics – e.g. is a service mesh present or not – without requiring specific configuration from the end-user.

 

An end-user in this definition is everybody who did not build the software that is deployed.
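
The reconciliation loop described in this section can be sketched roughly as follows. This is a simplified Python model under stated assumptions, not how any Kubernetes controller is actually implemented: the "platform" is a plain dict, and the `diff`, `act` and `reconcile` functions are hypothetical names chosen for this example.

```python
def diff(desired: dict, actual: dict) -> list:
    """Compare desired and actual state and return the actions needed."""
    actions = []
    for key, value in desired.items():
        if key not in actual:
            actions.append(("create", key, value))
        elif actual[key] != value:
            actions.append(("update", key, value))
    for key in actual:
        if key not in desired:
            actions.append(("delete", key, None))
    return actions


def act(actual: dict, actions: list) -> None:
    """Apply actions to the platform; here the 'platform' is just a dict."""
    for verb, key, value in actions:
        if verb == "delete":
            del actual[key]
        else:
            actual[key] = value


def reconcile(desired: dict, actual: dict, max_rounds: int = 10) -> int:
    """Run observe/diff/act until nothing is left to do; return rounds used."""
    for round_no in range(max_rounds):
        actions = diff(desired, actual)  # observe the current state and diff it
        if not actions:
            return round_no
        act(actual, actions)  # act on the differences
    raise RuntimeError("state did not converge")


desired = {f"redis-{i}": {"version": "5.0"} for i in range(3)}
actual = {"redis-0": {"version": "4.0"}, "stale-node": {"version": "4.0"}}
rounds = reconcile(desired, actual)
print(rounds, sorted(actual))
```

A real operator would run this loop continuously and react to change notifications, but the shape stays the same: the user only edits the desired state, and the loop works out which platform actions are needed.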

 

Bonus: How can this be done on other platforms:

 

In an AWS environment an Operator can be defined using an S3 Bucket for the configuration and a Lambda function which triggers API commands based on configuration changes of the S3 bucket.
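
As a rough illustration of this idea, the following Python sketch shows a Lambda-style handler that reacts to S3 change notifications. It only relies on the documented shape of S3 event records (`Records[].s3.bucket.name` and `Records[].s3.object.key`); the `make_handler` factory and the injected `fetch_config` and `apply_config` callables are hypothetical stand-ins for an S3 read and for the AWS API calls the operator would issue.

```python
import json


def make_handler(fetch_config, apply_config):
    """Build a Lambda-style handler from two injected, hypothetical callables.

    fetch_config(bucket, key) returns the configuration object as JSON text;
    apply_config(config) triggers the platform API calls for that config.
    """
    def handler(event, context=None):
        results = []
        for record in event.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            key = record["s3"]["object"]["key"]
            config = json.loads(fetch_config(bucket, key))
            results.append(apply_config(config))
        return results

    return handler


# Local stand-ins so the sketch runs without any AWS access.
store = {("operator-config", "redis.json"): '{"nodes": 3, "version": "5.0"}'}
applied = []


def fake_fetch(bucket, key):
    return store[(bucket, key)]


def fake_apply(config):
    applied.append(config)
    return "applied"


handler = make_handler(fake_fetch, fake_apply)
event = {
    "Records": [
        {"s3": {"bucket": {"name": "operator-config"}, "object": {"key": "redis.json"}}}
    ]
}
print(handler(event))
```

Note that this sketch only reacts to configuration changes. To match the operator model discussed in this thread, it would also need to check the actual state periodically and correct drift.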

 

 

 

 

 

From: <cncf-sig-app-delivery@...> on behalf of Noel OConnor <noel.oconnor@...>
Date: Monday, 11. November 2019 at 18:58
To: Erick Carty <erickcarty@...>
Cc: "chris@..." <chris@...>, "Li, Xiang" <x.li@...>, "cncf-sig-app-delivery@..." <cncf-sig-app-delivery@...>
Subject: Re: [cncf-sig-app-delivery] Operator Definition

 

"An Operator is an application-specific controller that extends the..."

 

Do we run the risk of confusing the concept of "controller" here? K8s has controllers, and Operators can have a collection of related controllers, so I think we need to come up with a better term.

 

On Mon, Nov 11, 2019 at 9:52 AM Erick Carty <erickcarty@...> wrote:

From this current discussion to answer "What is an operator?", the following have been proposed with some edits; if there are others from SIG-Apps or SIG-Apps-Delivery, please feel free to add them...

  • An Operator is an application-specific controller that extends the Kubernetes API to create, configure, and manage instances of complex (stateful) (applications) workloads on behalf of a Kubernetes user. It builds upon the basic Kubernetes resource and controller concepts but includes domain or application-specific knowledge to automate common tasks.
  • An Operator is a Kubernetes API extension to automate domain-specific workflow actions through declarative definitions.
  • An operator is a platform extension that creates, configures, and manages the lifecycle of workloads.
    • A Kubernetes operator is an extension to the Kubernetes API that builds upon the basic Kubernetes resource and controller concepts but includes domain-specific knowledge to automate common tasks for a given workload.
  • An operator is software that manages routine procedures and lifecycle operations of workloads.
    • A Kubernetes operator is a controller that manages routine procedures and lifecycle operations of workloads.

I feel that, ultimately, we're all trying to describe the same concepts from different angles: a general (concise) operator definition that doesn't require K8s, and a framework specific to K8s that allows potential automation of workload operations. The OperatorHub provides a Capability Levels category that tries to associate the available Operators with their specific use case and set expectations. It seems that ultimately we'll want to use an "Auto Pilot" operator type when available, but there may be a need for customizations, so perhaps a co-op play with Helm, Terraform, Ansible, etc. The notion of "runbooks as code" seems to resonate with some. How would we describe the ability to self-heal in a way that is closer/personalized to the workload, and (perhaps for a separate discussion) what implications does it have on storage, DR, network, etc.?

 

* Only 1 Operator today is classified as "Auto Pilot".

 

On Sun, Nov 10, 2019 at 8:06 AM Chris Short via Lists.Cncf.Io <chris=chrisshort.net@...> wrote:

Easy and simple/simply are words to be avoided in definitions, especially in the K8s space. :-)

 

Thank you so much to Matt for driving this discussion. Is there a working definition at this point that incorporates this discussion into Matt's original definition? I feel that recap will help drive this towards done.

 

Chris Short

He/Him/His

 

 

On Sat, Nov 9, 2019 at 10:28 PM Li, Xiang <x.li@...> wrote:

“Easy” is relative. At least Kubernetes makes it easier than some other environments to achieve the defined model. 

 

But yes, managing the complex app is the hard part. No one says k8s can make that part very easy. Managing an app is different from achieving the management model.

 

 

------------------------------------------------------------------
From: Gerred Dillon<hello@...>
Date: 2019-11-10 11:03:49
To: Xiang Li<x.li@...>
Cc: cncf-sig-app-delivery<cncf-sig-app-delivery@...>; Tobias Knaup<tobi@...>; Zhang, Lei<lei.zhang@...>
Subject: Re: [cncf-sig-app-delivery] Operator Definition

I disagree with the last paragraph of this reply - or the characterization in how it applies to existing operators. Strimzi is a CNCF example where these points are made true — and Kubernetes offers a model but doesn’t necessarily make it easy.

 

That said, the definitions are in line with my understanding of an operator. I want to contribute perspective indirectly with the intent to help this conversation...

 

The concept of providing state declaratively, regardless of implementation, is a key tenet of how we model systems and implement them in Kubernetes. So far this community has defined APIs in terms of Kubernetes-style object models or resources, which we have collectively and silently termed “declarative” for our use across projects. Maybe there’s something to be said for that and, taking a cue from Cloud Events, for determining what we expect an operator to be from the resource model and applying it to implementations?

 

That’s not to say we should suddenly start writing a CRD, but we should start creating a well-specified model and extract the definition from that.

 

Reading through it, this sounds still very generic. I’d like to start talking with existing operators and end users to understand their views of how this model could advance the state of services they feel “need” operators and expand it to our pending definition.

 

I’m happy to own this data collection with the SIG’s blessing and push it into the outstanding issue/PR.

 



On Nov 9, 2019, at 9:27 PM, Li, Xiang <x.li@...> wrote:

While we could extend the Operator concept developed at CoreOS to non-Kubernetes platforms, there are a few implicit key points that we must preserve. Otherwise we are defining something else.

 

1. Operator keeps the desired state of the managed workload. The API of an operator is declarative not imperative. 

 

2. Operator is reactive and mostly automatic. It follows the observe, diff, act pattern from Kubernetes generic controllers.

 

3. Operator embeds the domain knowledge for a specific workload operation into code. This makes an operator different from a controller. 

 

The Kubernetes environment just makes all three easy. Yes, there are so-called x-operators in the Kube world, but they do not try to achieve the three key points. Most of them should be called installers or deployers instead, I guess...

 

 

------------------------------------------------------------------
From: alexis richardson<alexis@...>
Date: 2019-11-09 21:30:08
To: Tobias Knaup<tobi@...>
Cc: Zhang, Lei<lei.zhang@...>; <cncf-sig-app-delivery@...>
Subject: Re: [cncf-sig-app-delivery] Operator Definition

"An operator is software that manages routine procedures and lifecycle
operations of workloads."

Isn't it also the case that an operator *delegates some of this
responsibility* to the orchestrator? Or "leverages the
orchestrator"?


On Fri, Nov 8, 2019 at 10:42 PM Tobias Knaup <tobi@...> wrote:
>
> +1 for workload and for a concise definition like Devdatta proposed.
> I found that most people get it when I explain operators as "runbooks as code", especially when I talk to folks who don't know advanced concepts like CRDs yet.
>
> So based on what was said here earlier and the Wikipedia article on runbooks an operator could be defined as:
>
> "An operator is software that manages routine procedures and lifecycle operations of workloads."
>
> A Kubernetes operator could be defined with a bit more detail:
>
> "A Kubernetes operator is a controller that manages routine procedures and lifecycle operations of workloads."
>
> - Tobi
>
>
> On Fri, Nov 8, 2019 at 12:09 PM alexis richardson <alexis@...> wrote:
>>
>> +1 for workload.  I think application will eventually have a definition we can all get behind, and it will be different.
>>
>>
>> On Fri, 8 Nov 2019, 19:20 Zhang, Lei, <lei.zhang@...> wrote:
>>>
>>> After reading thru the thread, I tend to agree "workload" reads more accurate than "application" in Operator Framework's case.
>>>
>>> The difference here seems to be: "how to run" vs "what to run".
>>>
>>> ---
>>> Lei Zhang (Harry)
>>>
>>> Alibaba Group
>>>
>>> ------------------Original Mail ------------------
>>> Sender: <cncf-sig-app-delivery@...>
>>> Send Date:Fri Nov 8 10:41:00 2019
>>> Recipients: <cncf-sig-app-delivery@...>
>>> CC: <cncf-sig-app-delivery@...>
>>> Subject:Re: [cncf-sig-app-delivery] Operator Definition
>>>>
>>>> I agree with Matt and Doug that the scope of the definition need not be specific to Kubernetes. Given this SIG is at the CNCF level, aren't we responsible for defining these things broadly across the industry for cloud native in general? We could create a general definition to state what an operator is, and then offer definitions specific to platforms that define how it works on said platform, or leave that up to those projects to define for themselves.
>>>>
>>>> For example, the general definition of operator could be something along the lines of Devdatta's proposal or the first sentence of Matt's proposal without mentioning Kubernetes specifics:
>>>>
>>>> "An operator is a platform extension that creates, configures, and manages the lifecycle of workloads."
>>>>
>>>>
>>>> The Kubernetes operator definition defines how the pattern works specifically on Kubernetes:
>>>>
>>>> "A Kubernetes operator is an extension to the Kubernetes API that builds upon the basic Kubernetes resource and controller concepts but includes domain-specific knowledge to automate common tasks for a given workload."
>>>>
>>>>
>>>> This has a few benefits:
>>>>
>>>> The general definition can be applied to systems where the pattern already exists using that system's specific constructs and terminology.
>>>> It should be easy enough to understand the general concept without getting into platform-specific details. "I have a Redis operator for platform X" gets the point across without getting into the details of platform X.
>>>> Platform-specific definitions can follow the general pattern in a way that is familiar and easy to understand for the platform's users. "I have a Redis operator for Kubernetes" denotes how the operator works specifically in Kubernetes, so you know exactly what this means in the context of Kubernetes, but even if you're not familiar with Kubernetes, you at least have an idea of what it means from the general definition.
>>>>
>>>>
>>>> Side note: Quinton mentioned earlier that the use of "application" here may be somewhat misleading. I tend to agree with that and so I used "workload" as a more general term.
>>>>
>> 

 



Noel OConnor
 

"An Operator is an application-specific controller that extends the..."

Do we run the risk of confusing the concept of "controller" here? K8s has controllers, and Operators can have a collection of related controllers, so I think we need to come up with a better term.

On Mon, Nov 11, 2019 at 9:52 AM Erick Carty <erickcarty@...> wrote:
From this current discussion to answer "What is an operator?", the following have been proposed with some edits; if there are others from SIG-Apps or SIG-Apps-Delivery, please feel free to add them...
  • An Operator is an application-specific controller that extends the Kubernetes API to create, configure, and manage instances of complex (stateful) (applications) workloads on behalf of a Kubernetes user. It builds upon the basic Kubernetes resource and controller concepts but includes domain or application-specific knowledge to automate common tasks.
  • An Operator is a Kubernetes API extension to automate domain-specific workflow actions through declarative definitions.
  • An operator is a platform extension that creates, configures, and manages the lifecycle of workloads.
    • A Kubernetes operator is an extension to the Kubernetes API that builds upon the basic Kubernetes resource and controller concepts but includes domain-specific knowledge to automate common tasks for a given workload.
  • An operator is software that manages routine procedures and lifecycle operations of workloads.
    • A Kubernetes operator is a controller that manages routine procedures and lifecycle operations of workloads.
I feel that, ultimately, we're all trying to describe the same concepts from different angles: a general (concise) operator definition that doesn't require K8s, and a framework specific to K8s that allows potential automation of workload operations. The OperatorHub provides a Capability Levels category that tries to associate the available Operators with their specific use case and set expectations. It seems that ultimately we'll want to use an "Auto Pilot" operator type when available, but there may be a need for customizations, so perhaps a co-op play with Helm, Terraform, Ansible, etc. The notion of "runbooks as code" seems to resonate with some. How would we describe the ability to self-heal in a way that is closer/personalized to the workload, and (perhaps for a separate discussion) what implications does it have on storage, DR, network, etc.?

* Only 1 Operator today is classified as "Auto Pilot".

On Sun, Nov 10, 2019 at 8:06 AM Chris Short via Lists.Cncf.Io <chris=chrisshort.net@...> wrote:
Easy and simple/simply are words to be avoided in definitions, especially in the K8s space. :-)

Thank you so much to Matt for driving this discussion. Is there a working definition at this point that incorporates this discussion into Matt's original definition? I feel that recap will help drive this towards done.

Chris Short
He/Him/His


On Sat, Nov 9, 2019 at 10:28 PM Li, Xiang <x.li@...> wrote:
“Easy” is relative. At least Kubernetes makes it easier than some other environments to achieve the defined model. 

But yes, managing the complex app is the hard part. No one says k8s can make that part very easy. Managing an app is different from achieving the management model.


------------------------------------------------------------------
From: Gerred Dillon<hello@...>
Date: 2019-11-10 11:03:49
To: Xiang Li<x.li@...>
Cc: cncf-sig-app-delivery<cncf-sig-app-delivery@...>; Tobias Knaup<tobi@...>; Zhang, Lei<lei.zhang@...>
Subject: Re: [cncf-sig-app-delivery] Operator Definition

I disagree with the last paragraph of this reply - or the characterization in how it applies to existing operators. Strimzi is a CNCF example where these points are made true — and Kubernetes offers a model but doesn’t necessarily make it easy.

That said, the definitions are in line with my understanding of an operator. I want to contribute perspective indirectly with the intent to help this conversation...

The concept of providing state declaratively, regardless of implementation, is a key tenet of how we model systems and implement them in Kubernetes. So far this community has defined APIs in terms of Kubernetes-style object models or resources, which we have collectively and silently termed “declarative” for our use across projects. Maybe there’s something to be said for that and, taking a cue from Cloud Events, for determining what we expect an operator to be from the resource model and applying it to implementations?

That’s not to say we should suddenly start writing a CRD, but we should start creating a well-specified model and extract the definition from that.

Reading through it, this sounds still very generic. I’d like to start talking with existing operators and end users to understand their views of how this model could advance the state of services they feel “need” operators and expand it to our pending definition.

I’m happy to own this data collection with the SIG’s blessing and push it into the outstanding issue/PR.


On Nov 9, 2019, at 9:27 PM, Li, Xiang <x.li@...> wrote:


While we could extend the Operator concept developed at CoreOS to non-Kubernetes platforms, there are a few implicit key points that we must preserve. Otherwise we are defining something else.

1. Operator keeps the desired state of the managed workload. The API of an operator is declarative not imperative. 

2. Operator is reactive and mostly automatic. It follows the observe, diff, act pattern from Kubernetes generic controllers.

3. Operator embeds the domain knowledge for a specific workload operation into code. This makes an operator different from a controller. 

The Kubernetes environment just makes all three easy. Yes, there are so-called x-operators in the Kube world, but they do not try to achieve the three key points. Most of them should be called installers or deployers instead, I guess...


------------------------------------------------------------------
From: alexis richardson<alexis@...>
Date: 2019-11-09 21:30:08
To: Tobias Knaup<tobi@...>
Cc: Zhang, Lei<lei.zhang@...>; <cncf-sig-app-delivery@...>
Subject: Re: [cncf-sig-app-delivery] Operator Definition

"An operator is software that manages routine procedures and lifecycle
operations of workloads."

Isn't it also the case that an operator *delegates some of this
responsibility* to the orchestrator? Or "leverages the
orchestrator"?


On Fri, Nov 8, 2019 at 10:42 PM Tobias Knaup <tobi@...> wrote:
>
> +1 for workload and for a concise definition like Devdatta proposed.
> I found that most people get it when I explain operators as "runbooks as code", especially when I talk to folks who don't know advanced concepts like CRDs yet.
>
> So based on what was said here earlier and the Wikipedia article on runbooks an operator could be defined as:
>
> "An operator is software that manages routine procedures and lifecycle operations of workloads."
>
> A Kubernetes operator could be defined with a bit more detail:
>
> "A Kubernetes operator is a controller that manages routine procedures and lifecycle operations of workloads."
>
> - Tobi
>
>
> On Fri, Nov 8, 2019 at 12:09 PM alexis richardson <alexis@...> wrote:
>>
>> +1 for workload.  I think application will eventually have a definition we can all get behind, and it will be different.
>>
>>
>> On Fri, 8 Nov 2019, 19:20 Zhang, Lei, <lei.zhang@...> wrote:
>>>
>>> After reading thru the thread, I tend to agree "workload" reads more accurate than "application" in Operator Framework's case.
>>>
>>> The difference here seems to be: "how to run" vs "what to run".
>>>
>>> ---
>>> Lei Zhang (Harry)
>>>
>>> Alibaba Group
>>>
>>> ------------------Original Mail ------------------
>>> Sender: <cncf-sig-app-delivery@...>
>>> Send Date:Fri Nov 8 10:41:00 2019
>>> Recipients: <cncf-sig-app-delivery@...>
>>> CC: <cncf-sig-app-delivery@...>
>>> Subject:Re: [cncf-sig-app-delivery] Operator Definition
>>>>
>>>> I agree with Matt and Doug that the scope of the definition need not be specific to Kubernetes. Given this SIG is at the CNCF level, aren't we responsible for defining these things broadly across the industry for cloud native in general? We could create a general definition to state what an operator is, and then offer definitions specific to platforms that define how it works on said platform, or leave that up to those projects to define for themselves.
>>>>
>>>> For example, the general definition of operator could be something along the lines of Devdatta's proposal or the first sentence of Matt's proposal without mentioning Kubernetes specifics:
>>>>
>>>> "An operator is a platform extension that creates, configures, and manages the lifecycle of workloads."
>>>>
>>>>
>>>> The Kubernetes operator definition defines how the pattern works specifically on Kubernetes:
>>>>
>>>> "A Kubernetes operator is an extension to the Kubernetes API that builds upon the basic Kubernetes resource and controller concepts but includes domain-specific knowledge to automate common tasks for a given workload."
>>>>
>>>>
>>>> This has a few benefits:
>>>>
>>>> The general definition can be applied to systems where the pattern already exists using that system's specific constructs and terminology.
>>>> It should be easy enough to understand the general concept without getting into platform-specific details. "I have a Redis operator for platform X" gets the point across without getting into the details of platform X.
>>>> Platform-specific definitions can follow the general pattern in a way that is familiar and easy to understand for the platform's users. "I have a Redis operator for Kubernetes" denotes how the operator works specifically in Kubernetes, so you know exactly what this means in the context of Kubernetes, but even if you're not familiar with Kubernetes, you at least have an idea of what it means from the general definition.
>>>>
>>>>
>>>> Side note: Quinton mentioned earlier that the use of "application" here may be somewhat misleading. I tend to agree with that and so I used "workload" as a more general term.
>>>>
>> 




Erick Carty
 

From this current discussion to answer "What is an operator?", the following have been proposed with some edits; if there are others from SIG-Apps or SIG-Apps-Delivery, please feel free to add them...
  • An Operator is an application-specific controller that extends the Kubernetes API to create, configure, and manage instances of complex (stateful) (applications) workloads on behalf of a Kubernetes user. It builds upon the basic Kubernetes resource and controller concepts but includes domain or application-specific knowledge to automate common tasks.
  • An Operator is a Kubernetes API extension to automate domain-specific workflow actions through declarative definitions.
  • An operator is a platform extension that creates, configures, and manages the lifecycle of workloads.
    • A Kubernetes operator is an extension to the Kubernetes API that builds upon the basic Kubernetes resource and controller concepts but includes domain-specific knowledge to automate common tasks for a given workload.
  • An operator is software that manages routine procedures and lifecycle operations of workloads.
    • A Kubernetes operator is a controller that manages routine procedures and lifecycle operations of workloads.
I feel that, ultimately, we're all trying to describe the same concepts from different angles: a general (concise) operator definition that doesn't require K8s, and a framework specific to K8s that allows potential automation of workload operations. The OperatorHub provides a Capability Levels category that tries to associate the available Operators with their specific use case and set expectations. It seems that ultimately we'll want to use an "Auto Pilot" operator type when available, but there may be a need for customizations, so perhaps a co-op play with Helm, Terraform, Ansible, etc. The notion of "runbooks as code" seems to resonate with some. How would we describe the ability to self-heal in a way that is closer/personalized to the workload, and (perhaps for a separate discussion) what implications does it have on storage, DR, network, etc.?

* Only 1 Operator today is classified as "Auto Pilot".

On Sun, Nov 10, 2019 at 8:06 AM Chris Short via Lists.Cncf.Io <chris=chrisshort.net@...> wrote:
Easy and simple/simply are words to be avoided in definitions, especially in the K8s space. :-)

Thank you so much to Matt for driving this discussion. Is there a working definition at this point that incorporates this discussion into Matt's original definition? I feel that recap will help drive this towards done.

Chris Short
He/Him/His


On Sat, Nov 9, 2019 at 10:28 PM Li, Xiang <x.li@...> wrote:
“Easy” is relative. At least Kubernetes makes it easier than some other environments to achieve the defined model. 

But yes, managing the complex app is the hard part. No one says k8s can make that part very easy. Managing an app is different from achieving the management model.


------------------------------------------------------------------
From: Gerred Dillon<hello@...>
Date: 2019-11-10 11:03:49
To: Xiang Li<x.li@...>
Cc: cncf-sig-app-delivery<cncf-sig-app-delivery@...>; Tobias Knaup<tobi@...>; Zhang, Lei<lei.zhang@...>
Subject: Re: [cncf-sig-app-delivery] Operator Definition

I disagree with the last paragraph of this reply - or the characterization in how it applies to existing operators. Strimzi is a CNCF example where these points are made true — and Kubernetes offers a model but doesn’t necessarily make it easy.

That said, the definitions are in line with my understanding of an operator. I want to contribute perspective indirectly with the intent to help this conversation...

The concept of providing state declaratively, regardless of implementation, is a key tenet of how we model systems and implement them in Kubernetes. So far this community has defined APIs in terms of Kubernetes-style object models or resources, which we have collectively and silently termed “declarative” for our use across projects. Maybe there’s something to be said for that and, taking a cue from Cloud Events, for determining what we expect an operator to be from the resource model and applying it to implementations?

That’s not to say we should suddenly start writing a CRD, but we should start creating a well-specified model and extract the definition from that.

Reading through it, this still sounds very generic. I’d like to start talking with existing operators and end users to understand their views of how this model could advance the state of services they feel “need” operators and expand it to our pending definition.

I’m happy to own this data collection with the SIG’s blessing and push it into the outstanding issue/PR.


On Nov 9, 2019, at 9:27 PM, Li, Xiang <x.li@...> wrote:


While we could extend the Operator concept developed at CoreOS to non-Kubernetes platforms, there are a few implicit key points that we must preserve. Otherwise we are defining something else.

1. An operator keeps the desired state of the managed workload. The API of an operator is declarative, not imperative.

2. An operator is reactive and mostly automatic. It follows the observe, diff, act pattern from Kubernetes generic controllers.

3. An operator embeds the domain knowledge for a specific workload operation into code. This makes an operator different from a controller.

The Kubernetes environment just makes all three easy. Yes, there are so-called x-operators in the Kube world, but they do not try to achieve the three key points. Most of them should be called installers or deployers instead, I guess...


------------------------------------------------------------------
From: alexis richardson <alexis@...>
Date: 2019-11-09 21:30:08
To: Tobias Knaup <tobi@...>
Cc: Zhang, Lei <lei.zhang@...>; <cncf-sig-app-delivery@...>
Subject: Re: [cncf-sig-app-delivery] Operator Definition

"An operator is software that manages routine procedures and lifecycle
operations of workloads."

Isn't it also the case that an operator *delegates some of this
responsibility* to the orchestrator?   Or, "leverages the
orchestrator".


On Fri, Nov 8, 2019 at 10:42 PM Tobias Knaup <tobi@...> wrote:
>
> +1 for workload and for a concise definition like Devdatta proposed.
> I found that most people get it when I explain operators as "runbooks as code". Especially when I talk to folks that don't know advanced concepts like CRDs yet.
>
> So based on what was said here earlier and the Wikipedia article on runbooks an operator could be defined as:
>
> "An operator is software that manages routine procedures and lifecycle operations of workloads."
>
> A Kubernetes operator could be defined with a bit more detail:
>
> "A Kubernetes operator is a controller that manages routine procedures and lifecycle operations of workloads."
>
> - Tobi
>
>
> On Fri, Nov 8, 2019 at 12:09 PM alexis richardson <alexis@...> wrote:
>>
>> +1 for workload.  I think application will eventually have a definition we can all get behind, and it will be different.
>>
>>
>> On Fri, 8 Nov 2019, 19:20 Zhang, Lei, <lei.zhang@...> wrote:
>>>
>>> After reading through the thread, I tend to agree that "workload" reads more accurately than "application" in the Operator Framework's case.
>>>
>>> The difference here seems to be: "how to run" vs "what to run".
>>>
>>> ---
>>> Lei Zhang (Harry)
>>>
>>> Alibaba Group
>>>
>>> ------------------Original Mail ------------------
>>> Sender: <cncf-sig-app-delivery@...>
>>> Send Date:Fri Nov 8 10:41:00 2019
>>> Recipients: <cncf-sig-app-delivery@...>
>>> CC: <cncf-sig-app-delivery@...>
>>> Subject:Re: [cncf-sig-app-delivery] Operator Definition
>>>>
>>>> I agree with Matt and Doug that the scope of the definition need not be specific to Kubernetes. Given this SIG is at the CNCF level, aren't we responsible for defining these things broadly across the industry for cloud native in general? We could create a general definition to state what an operator is, and then offer definitions specific to platforms that define how it works on said platform, or leave that up to those projects to define for themselves.
>>>>
>>>> For example, the general definition of operator could be something along the lines of Devdatta's proposal or the first sentence of Matt's proposal without mentioning Kubernetes specifics:
>>>>
>>>> "An operator is a platform extension that creates, configures, and manages the lifecycle of workloads."
>>>>
>>>>
>>>> The Kubernetes operator definition defines how the pattern works specifically on Kubernetes:
>>>>
>>>> "A Kubernetes operator is an extension to the Kubernetes API that builds upon the basic Kubernetes resource and controller concepts but includes domain-specific knowledge to automate common tasks for a given workload."
>>>>
>>>>
>>>> This has a few benefits:
>>>>
>>>>   • The general definition can be applied to systems where the pattern already exists using that system's specific constructs and terminology.
>>>>   • It should be easy enough to understand the general concept without getting into platform-specific details. "I have a Redis operator for platform X" gets the point across without getting into the details of platform X.
>>>>   • Platform-specific definitions can follow the general pattern in a way that is familiar and easy to understand for the platform's users. "I have a Redis operator for Kubernetes" denotes how the operator works specifically in Kubernetes, so you know exactly what this means in the context of Kubernetes, but even if you're not familiar with Kubernetes, you at least have an idea of what it means from the general definition.
>>>>
>>>>
>>>> Side note: Quinton mentioned earlier that the use of "application" here may be somewhat misleading. I tend to agree with that and so I used "workload" as a more general term.
>>>>
>> 




Chris Short <chris@...>
 

Easy and simple/simply are words to be avoided in definitions, especially in the K8s space. :-)

Thank you so much to Matt for driving this discussion. Is there a working definition at this point that incorporates this discussion into Matt's original definition? I feel that a recap will help drive this towards done.

Chris Short
He/Him/His






Li, Xiang
 

“Easy” is relative. At least Kubernetes makes it easier than some other environments to achieve the defined model. 

But yes, managing the complex app is the hard part. No one says k8s can make that part very easy. Managing an app is different from achieving the management model.






Gerred Dillon
 

I disagree with the last paragraph of this reply, or at least with how it characterizes existing operators. Strimzi is a CNCF example where these points hold true, and Kubernetes offers a model but doesn’t necessarily make it easy.

That said, the definitions are in line with my understanding of an operator. I want to contribute perspective indirectly with the intent to help this conversation...

The concept of providing state declaratively, regardless of implementation, is a key tenet of how we model systems and implement them in Kubernetes. So far this community has defined APIs in terms of Kubernetes-style object models or resources, which we have collectively and silently termed “declarative” for our use across projects. Maybe there’s something to be said for that, and, taking a cue from CloudEvents, for determining what we expect an operator to be from the resource model and applying it to implementations?

That’s not to say we should suddenly start writing a CRD, but we should start creating a well-specified model and extract the definition from that.

Reading through it, this still sounds very generic. I’d like to start talking with existing operators and end users to understand their views of how this model could advance the state of services they feel “need” operators and expand it to our pending definition.

I’m happy to own this data collection with the SIG’s blessing and push it into the outstanding issue/PR.





Li, Xiang
 

While we could extend the Operator concept developed at CoreOS to non-Kubernetes platforms, there are a few implicit key points that we must preserve. Otherwise we are defining something else.

1. An operator keeps the desired state of the managed workload. The API of an operator is declarative, not imperative.

2. An operator is reactive and mostly automatic. It follows the observe, diff, act pattern from Kubernetes generic controllers.

3. An operator embeds the domain knowledge for a specific workload operation into code. This makes an operator different from a controller.

The Kubernetes environment just makes all three easy. Yes, there are so-called x-operators in the Kube world, but they do not try to achieve the three key points. Most of them should be called installers or deployers instead, I guess...
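
To make the three points above concrete, here is a minimal, platform-neutral sketch of the observe, diff, act pattern. It is only an illustration under stated assumptions: the names DesiredState, ObservedState, diff, act and reconcile are hypothetical, the "workload" is a toy in-memory object, and the rule "upgrade before scaling" stands in for whatever domain knowledge a real operator would encode. It does not use any Kubernetes API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DesiredState:
    """Declarative spec (hypothetical): what the user wants, not how to get there."""
    replicas: int
    version: str


@dataclass
class ObservedState:
    """Toy in-memory stand-in for what is actually running."""
    replicas: int
    version: str


def diff(desired: DesiredState, observed: ObservedState) -> List[str]:
    """Compare desired and observed state and return the actions needed."""
    actions = []
    # Assumed domain knowledge for this toy workload: upgrade before
    # scaling, so that new replicas start on the target version.
    if observed.version != desired.version:
        actions.append(f"upgrade:{desired.version}")
    if observed.replicas != desired.replicas:
        actions.append(f"scale:{desired.replicas}")
    return actions


def act(observed: ObservedState, actions: List[str]) -> None:
    """Apply each action to the simulated workload."""
    for action in actions:
        kind, _, value = action.partition(":")
        if kind == "upgrade":
            observed.version = value
        elif kind == "scale":
            observed.replicas = int(value)


def reconcile(desired: DesiredState, observed: ObservedState) -> List[str]:
    """One pass of observe -> diff -> act; returns the actions taken."""
    actions = diff(desired, observed)
    act(observed, actions)
    return actions


desired = DesiredState(replicas=3, version="2.0")
observed = ObservedState(replicas=1, version="1.0")
first = reconcile(desired, observed)   # actions needed to converge
second = reconcile(desired, observed)  # nothing left to do
```

The user only ever edits the desired state; the loop works out the steps. Running reconcile again once the states match yields no actions, which is what makes the loop safe to repeat.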





alexis richardson
 

"An operator is software that manages routine procedures and lifecycle
operations of workloads."

Isn't it also the case that an operator *delegates some of this
responsibility* to the orchestrator? Or, "leverages the
orchestrator".

On Fri, Nov 8, 2019 at 10:42 PM Tobias Knaup <tobi@...> wrote:

+1 for workload and for a concise definition like Devdatta proposed.
I found that most people get it when I explain operators as "runbooks as code". Especially when I talk to folks that don't know advanced concepts like CRDs yet.

So based on what was said here earlier and the Wikipedia article on runbooks an operator could be defined as:

"An operator is software that manages routine procedures and lifecycle operations of workloads."

A Kubernetes operator could be defined with a bit more detail:

"A Kubernetes operator is a controller that manages routine procedures and lifecycle operations of workloads."

- Tobi


On Fri, Nov 8, 2019 at 12:09 PM alexis richardson <alexis@...> wrote:

+1 for workload. I think application will eventually have a definition we can all get behind, and it will be different.


On Fri, 8 Nov 2019, 19:20 Zhang, Lei, <lei.zhang@...> wrote:

After reading through the thread, I tend to agree that "workload" reads as more accurate than "application" in Operator Framework's case.

The difference here seems to be: "how to run" vs "what to run".

---
Lei Zhang (Harry)

Alibaba Group

------------------Original Mail ------------------
Sender: <cncf-sig-app-delivery@...>
Send Date:Fri Nov 8 10:41:00 2019
Recipients: <cncf-sig-app-delivery@...>
CC: <cncf-sig-app-delivery@...>
Subject:Re: [cncf-sig-app-delivery] Operator Definition

I agree with Matt and Doug that the scope of the definition need not be specific to Kubernetes. Given this SIG is at the CNCF level, aren't we responsible for defining these things broadly across the industry for cloud native in general? We could create a general definition to state what an operator is, and then offer definitions specific to platforms that define how it works on said platform, or leave that up to those projects to define for themselves.

For example, the general definition of operator could be something along the lines of Devdatta's proposal or the first sentence of Matt's proposal without mentioning Kubernetes specifics:

"An operator is a platform extension that creates, configures, and manages the lifecycle of workloads."


The Kubernetes operator definition defines how the pattern works specifically on Kubernetes:

"A Kubernetes operator is an extension to the Kubernetes API that builds upon the basic Kubernetes resource and controller concepts but includes domain-specific knowledge to automate common tasks for a given workload."


This has a few benefits:

The general definition can be applied to systems where the pattern already exists using that system's specific constructs and terminology.
It should be easy enough to understand the general concept without getting into platform-specific details. "I have a Redis operator for platform X" gets the point across without getting into the details of platform X.
Platform-specific definitions can follow the general pattern in a way that is familiar and easy to understand for the platform's users. "I have a Redis operator for Kubernetes" denotes how the operator works specifically in Kubernetes, so you know exactly what this means in the context of Kubernetes, but even if you're not familiar with Kubernetes, you at least have an idea of what it means from the general definition.


Side note: Quinton mentioned earlier that the use of "application" here may be somewhat misleading. I tend to agree with that and so I used "workload" as a more general term.


Tobias Knaup
 

+1 for workload and for a concise definition like Devdatta proposed.
I found that most people get it when I explain operators as "runbooks as code", especially when I talk to folks who don't know advanced concepts like CRDs yet.

So based on what was said here earlier and the Wikipedia article on runbooks an operator could be defined as:

"An operator is software that manages routine procedures and lifecycle operations of workloads."

A Kubernetes operator could be defined with a bit more detail:

"A Kubernetes operator is a controller that manages routine procedures and lifecycle operations of workloads."

- Tobi


On Fri, Nov 8, 2019 at 12:09 PM alexis richardson <alexis@...> wrote:
+1 for workload.  I think application will eventually have a definition we can all get behind, and it will be different.


On Fri, 8 Nov 2019, 19:20 Zhang, Lei, <lei.zhang@...> wrote:
After reading through the thread, I tend to agree that "workload" reads as more accurate than "application" in Operator Framework's case.

The difference here seems to be: "how to run" vs "what to run".

---
Lei Zhang (Harry)

Alibaba Group
------------------Original Mail ------------------
Send Date:Fri Nov 8 10:41:00 2019
Subject:Re: [cncf-sig-app-delivery] Operator Definition
I agree with Matt and Doug that the scope of the definition need not be specific to Kubernetes. Given this SIG is at the CNCF level, aren't we responsible for defining these things broadly across the industry for cloud native in general? We could create a general definition to state what an operator is, and then offer definitions specific to platforms that define how it works on said platform, or leave that up to those projects to define for themselves.

For example, the general definition of operator could be something along the lines of Devdatta's proposal or the first sentence of Matt's proposal without mentioning Kubernetes specifics:

"An operator is a platform extension that creates, configures, and manages the lifecycle of workloads."


The Kubernetes operator definition defines how the pattern works specifically on Kubernetes:

"A Kubernetes operator is an extension to the Kubernetes API that builds upon the basic Kubernetes resource and controller concepts but includes domain-specific knowledge to automate common tasks for a given workload."


This has a few benefits:
  • The general definition can be applied to systems where the pattern already exists using that system's specific constructs and terminology.
  • It should be easy enough to understand the general concept without getting into platform-specific details. "I have a Redis operator for platform X" gets the point across without getting into the details of platform X.
  • Platform-specific definitions can follow the general pattern in a way that is familiar and easy to understand for the platform's users. "I have a Redis operator for Kubernetes" denotes how the operator works specifically in Kubernetes, so you know exactly what this means in the context of Kubernetes, but even if you're not familiar with Kubernetes, you at least have an idea of what it means from the general definition.

Side note: Quinton mentioned earlier that the use of "application" here may be somewhat misleading. I tend to agree with that and so I used "workload" as a more general term.


alexis richardson
 

+1 for workload.  I think application will eventually have a definition we can all get behind, and it will be different.


On Fri, 8 Nov 2019, 19:20 Zhang, Lei, <lei.zhang@...> wrote:
After reading through the thread, I tend to agree that "workload" reads as more accurate than "application" in Operator Framework's case.

The difference here seems to be: "how to run" vs "what to run".

---
Lei Zhang (Harry)

Alibaba Group
------------------Original Mail ------------------
Send Date:Fri Nov 8 10:41:00 2019
Subject:Re: [cncf-sig-app-delivery] Operator Definition
I agree with Matt and Doug that the scope of the definition need not be specific to Kubernetes. Given this SIG is at the CNCF level, aren't we responsible for defining these things broadly across the industry for cloud native in general? We could create a general definition to state what an operator is, and then offer definitions specific to platforms that define how it works on said platform, or leave that up to those projects to define for themselves.

For example, the general definition of operator could be something along the lines of Devdatta's proposal or the first sentence of Matt's proposal without mentioning Kubernetes specifics:

"An operator is a platform extension that creates, configures, and manages the lifecycle of workloads."


The Kubernetes operator definition defines how the pattern works specifically on Kubernetes:

"A Kubernetes operator is an extension to the Kubernetes API that builds upon the basic Kubernetes resource and controller concepts but includes domain-specific knowledge to automate common tasks for a given workload."


This has a few benefits:
  • The general definition can be applied to systems where the pattern already exists using that system's specific constructs and terminology.
  • It should be easy enough to understand the general concept without getting into platform-specific details. "I have a Redis operator for platform X" gets the point across without getting into the details of platform X.
  • Platform-specific definitions can follow the general pattern in a way that is familiar and easy to understand for the platform's users. "I have a Redis operator for Kubernetes" denotes how the operator works specifically in Kubernetes, so you know exactly what this means in the context of Kubernetes, but even if you're not familiar with Kubernetes, you at least have an idea of what it means from the general definition.

Side note: Quinton mentioned earlier that the use of "application" here may be somewhat misleading. I tend to agree with that and so I used "workload" as a more general term.


Zhang, Lei
 

After reading through the thread, I tend to agree that "workload" reads as more accurate than "application" in Operator Framework's case.

The difference here seems to be: "how to run" vs "what to run".

---
Lei Zhang (Harry)

Alibaba Group

------------------Original Mail ------------------
Sender: <cncf-sig-app-delivery@...>
Send Date:Fri Nov 8 10:41:00 2019
Recipients: <cncf-sig-app-delivery@...>
CC: <cncf-sig-app-delivery@...>
Subject:Re: [cncf-sig-app-delivery] Operator Definition
I agree with Matt and Doug that the scope of the definition need not be specific to Kubernetes. Given this SIG is at the CNCF level, aren't we responsible for defining these things broadly across the industry for cloud native in general? We could create a general definition to state what an operator is, and then offer definitions specific to platforms that define how it works on said platform, or leave that up to those projects to define for themselves.

For example, the general definition of operator could be something along the lines of Devdatta's proposal or the first sentence of Matt's proposal without mentioning Kubernetes specifics:

"An operator is a platform extension that creates, configures, and manages the lifecycle of workloads."


The Kubernetes operator definition defines how the pattern works specifically on Kubernetes:

"A Kubernetes operator is an extension to the Kubernetes API that builds upon the basic Kubernetes resource and controller concepts but includes domain-specific knowledge to automate common tasks for a given workload."


This has a few benefits:
  • The general definition can be applied to systems where the pattern already exists using that system's specific constructs and terminology.
  • It should be easy enough to understand the general concept without getting into platform-specific details. "I have a Redis operator for platform X" gets the point across without getting into the details of platform X.
  • Platform-specific definitions can follow the general pattern in a way that is familiar and easy to understand for the platform's users. "I have a Redis operator for Kubernetes" denotes how the operator works specifically in Kubernetes, so you know exactly what this means in the context of Kubernetes, but even if you're not familiar with Kubernetes, you at least have an idea of what it means from the general definition.

Side note: Quinton mentioned earlier that the use of "application" here may be somewhat misleading. I tend to agree with that and so I used "workload" as a more general term.


Vaclav Turecek
 

I agree with Matt and Doug that the scope of the definition need not be specific to Kubernetes. Given this SIG is at the CNCF level, aren't we responsible for defining these things broadly across the industry for cloud native in general? We could create a general definition to state what an operator is, and then offer definitions specific to platforms that define how it works on said platform, or leave that up to those projects to define for themselves.

For example, the general definition of operator could be something along the lines of Devdatta's proposal or the first sentence of Matt's proposal without mentioning Kubernetes specifics:

"An operator is a platform extension that creates, configures, and manages the lifecycle of workloads."


The Kubernetes operator definition defines how the pattern works specifically on Kubernetes:

"A Kubernetes operator is an extension to the Kubernetes API that builds upon the basic Kubernetes resource and controller concepts but includes domain-specific knowledge to automate common tasks for a given workload."


This has a few benefits:
  • The general definition can be applied to systems where the pattern already exists using that system's specific constructs and terminology.
  • It should be easy enough to understand the general concept without getting into platform-specific details. "I have a Redis operator for platform X" gets the point across without getting into the details of platform X.
  • Platform-specific definitions can follow the general pattern in a way that is familiar and easy to understand for the platform's users. "I have a Redis operator for Kubernetes" denotes how the operator works specifically in Kubernetes, so you know exactly what this means in the context of Kubernetes, but even if you're not familiar with Kubernetes, you at least have an idea of what it means from the general definition.

Side note: Quinton mentioned earlier that the use of "application" here may be somewhat misleading. I tend to agree with that and so I used "workload" as a more general term.


Devdatta Kulkarni
 

@Matt, @Doug,


I definitely see your points. I just think that we have a good opportunity here to start with a clean slate and come up with a definition that is generic as well as precise. In fact, we should first come up with a list of criteria against which any potential definition(s) can be evaluated. This will help us suss out what words should/should not be included in the final definition. Here are some criteria that I can think of:


1) Representativeness: A definition should be representative of the varied ways in which Operators are currently being implemented in the K8s community.


2) Providing direction for future use: A definition should also help future developers/users/adopters of Operators. Specifically, a definition should point towards the best practices of developing Operators for future/newer use cases.


3) Technical Precision: A definition that is too broad will not help to clearly delineate what is and what is not an Operator.


4) Wide understandability: It should resonate with K8s experts, technologists who may not be K8s experts, and non-technologists (sales, marketing, analysts, etc.) alike.


5) Short (nice to have): A couple of sentences is fine. A paragraph might be too much.


Here is my attempt to evaluate various thoughts/words that have been shared on this thread so far against the above criteria.


a) ‘Declarative definitions’: Might be too technical for some audiences, so it fails criterion #4.


b) ‘Independent of CRD/Custom Resource’: While technically both these points are valid (e.g., a ConfigMap can be used to pass the data instead of a CR, and a regular Pod can also perform the reconciliation functions without any CRD), I believe it fails criterion #2.


I think our definition should give direction towards a recommended approach of creating K8s Operators. There is a reason CRDs were added to K8s. The Custom Resource abstraction provides better controls than a ConfigMap to pass in the Operator data, and the associated Custom Controller provides a better abstraction, one that understands how to work with Custom Resources, than generic controllers deployed as Pods. If we don’t recommend that future Operator writers think in terms of Custom Resources/Custom Controllers, then a valid question to raise would be why we should introduce a new term (‘Operator’) in the first place. The term ‘Controller’ would be enough. So I think we should be opinionated here and identify the words that will provide some sort of direction as to how an ideal Operator should be implemented.
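
To illustrate the point about a Custom Resource giving better control over the Operator's input than a ConfigMap, here is a minimal Python sketch. It is not Kubernetes client code: the `RedisClusterSpec` type, its fields, and the replica limits are hypothetical, chosen only to contrast typed, validated input with a free-form map of strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedisClusterSpec:
    """Hypothetical typed spec, standing in for a Custom Resource schema."""
    replicas: int
    version: str

    def __post_init__(self) -> None:
        # Schema-style validation: bad input is rejected before any
        # reconciliation logic ever sees it.
        if isinstance(self.replicas, bool) or not isinstance(self.replicas, int):
            raise TypeError("replicas must be an integer")
        if not 1 <= self.replicas <= 9:
            raise ValueError("replicas must be between 1 and 9")
        if not self.version:
            raise ValueError("version must not be empty")


def spec_from_configmap(data: dict[str, str]) -> RedisClusterSpec:
    """ConfigMap-style input is a flat map of strings, so the controller
    has to parse and convert every field itself before it can validate."""
    return RedisClusterSpec(
        replicas=int(data["replicas"]),  # raises ValueError on "three"
        version=data.get("version", ""),
    )
```

With the typed spec, the shape of the input is part of the contract; with the string map, every controller re-implements the parsing step and errors surface only when that parsing runs.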


c) ‘Controller’, ‘Resource’: Fails criterion #4


d) ‘On user behalf’: Not a necessity. Fails criteria #3 and #5.


e) Kubernetes REST API Extension / Kubernetes API Extension: passes all above criteria.



I would be happy to collaboratively work on the definition on a Google doc. 


- Devdatta



From: cncf-sig-app-delivery@... <cncf-sig-app-delivery@...> on behalf of Matt Farina via Lists.Cncf.Io <matt=mattfarina.com@...>
Sent: Thursday, November 7, 2019 9:01 AM
To: cncf-sig-app-delivery@... <cncf-sig-app-delivery@...>
Cc: cncf-sig-app-delivery@... <cncf-sig-app-delivery@...>
Subject: Re: [cncf-sig-app-delivery] Operator Definition
 
I would also point out that I believe it doesn’t need to be an extension of the Kube API (ala CRD), you could just deploy a single Pod/Deployment to do this management, no?

I think Doug brings up an excellent point.

Where I started from when I put the CoreOS definition out there assumed Kubernetes. But,  technically you could have a Nomad operator and maybe even a Cloud Foundry operator and those would look different in implementation. Not only do you not need to extend the Kubernetes API, Kubernetes isn't really a requirement in the picture even if all of us will use it.

Should we have both a definition of an operator and a common k8s operator?

On Wed, Nov 6, 2019, at 11:58 PM, Doug Davis wrote:
I don’t want Matt to feel alone :-) so I’ll agree with his concern about the latest proposed wording being a bit too... low-level/geeky for a not insignificant number of people who will read it.

“domain-specific workflow actions through declarative definitions”, while probably 100% accurate, is something I think too many people will have to read several times, very slowly, to fully grok. I think you can convey the same net result with slightly different wording, tweaked from Matt’s original text: “manage the creation and lifecycle of Kubernetes deployed applications”. Or something like that.

I would also point out that I believe it doesn’t need to be an extension of the Kube API (ala CRD), you could just deploy a single Pod/Deployment to do this management, no?

-Doug




On Nov 6, 2019, at 3:02 PM, Devdatta Kulkarni <devdatta@...> wrote:


The point about ‘whom’ the definition is targeted towards is valid.

As pointed out below, there is a wide variety of personas who need to refer to ‘Kubernetes Operators’. The key focus needs to be on those who are not from the Kubernetes world but represent the domains for which Kubernetes Operators are being written. For this reason, I believe the definition should not contain Kubernetes-specific terms like ‘controller’ or ‘resource’ that may be confusing to the broader community. Generally understood and commonly used terms are better.

This has been our experience when working with enterprises adopting Kubernetes. Terms such as CRD, Custom Resources, Custom Controllers, Resources, and Controllers are too Kubernetes-specific for them. They are lost when discussing Operators using these terms. Instead, what resonates are terms like REST APIs, API extension, workflows, etc.

In fact, based on the recurrent questions that we have received about Operators, we ended up creating an Operator FAQ:

https://github.com/cloud-ark/kubeplus/blob/master/Operator-FAQ.md



On the point of API Extension being confused with Aggregated APIs — it is valid.

To accommodate this, the alternative suggested definition can be tweaked to:

"An Operator is a Kubernetes REST API extension to automate domain-specific workflow actions through declarative definitions."


The declarative part is not specific to cloud native, to be honest. It has been around since Infrastructure-as-Code systems: ‘declarative’ captures ‘how’ the automation is specified and is generally well understood. Other alternatives to 'declarative definitions' can be 'declarative specifications' or 'a declarative model'.
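
As a rough illustration of what ‘declarative’ means here, the following Python sketch takes a description of the desired state and works out the actions needed to get there, instead of being handed the actions directly. The workload names, the replica counts, and the action strings are all hypothetical; this is a sketch of the general idea, not how any particular Operator is implemented.

```python
def reconcile(desired: dict[str, int], observed: dict[str, int]) -> list[str]:
    """Return the actions that would move the observed state to the desired state.

    Both arguments map a workload name to a replica count. The caller only
    declares *what* should exist; the steps are derived here.
    """
    actions = []
    for name, want in sorted(desired.items()):
        have = observed.get(name)
        if have is None:
            actions.append(f"create {name} replicas={want}")
        elif have != want:
            actions.append(f"scale {name} {have}->{want}")
    # Anything running that is no longer declared gets removed.
    for name in sorted(observed.keys() - desired.keys()):
        actions.append(f"delete {name}")
    return actions
```

Calling `reconcile` again once the observed state matches the desired state yields an empty list, which is the property that lets such a loop run repeatedly without side effects.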


Best regards,

Devdatta





From: cncf-sig-app-delivery@... <cncf-sig-app-delivery@...> on behalf of Matt Farina via Lists.Cncf.Io <matt=mattfarina.com@...>
Sent: Wednesday, November 6, 2019 12:54 PM
To: cncf-sig-app-delivery@... <cncf-sig-app-delivery@...>
Cc: cncf-sig-app-delivery@... <cncf-sig-app-delivery@...>
Subject: Re: [cncf-sig-app-delivery] Operator Definition
 
For a moment can we consider who will read this and need to understand it. I think the language that's chosen should be useful to them and not just us. Many of the people who need to understand it are not on this list and representative personas or roles for them may not be on this list.

I would suggest the definition needs to make sense for:
  • People who run applications in Kubernetes
  • Managers of those people running applications. This is so that they understand the work (some but not all of them are technical) and are able to use it to justify the work up the chain in larger organizations
  • Sales, support, and other tangential people. Imagine the case where a salesperson at Redis is talking to a business manager at a company just coming on board to Kubernetes. Just coming on board means they don't deeply know many of the concepts the way we do. Redis is talking about the Redis Enterprise operator and needs to clearly communicate what operators are. And, if the people on the receiving end go to look it up, that needs to jibe.

I would personally like to see a definition that works for people who go far beyond those who are close to the tooling.

"An Operator is a Kubernetes API extension to automate domain-specific workflow actions through declarative definitions."

My fear, and maybe it's unfounded, is that a definition like that would lead to head nodding from many, but they may not really understand it. To be honest, I've learned that there are a lot of people who are just approaching cloud native who don't quite get what we mean by declarative until they have been immersed in cloud native for a while. This was one of the things that drew me to the original CoreOS definition.

Removing complex and stateful from that definition would read...

An Operator is an application-specific controller that extends the Kubernetes API to create, configure, and manage instances of applications on behalf of a Kubernetes user. It builds upon the basic Kubernetes resource and controller concepts but includes domain or application-specific knowledge to automate common tasks.

It doesn't get into the difference between application and system software. When it comes to system software I wouldn't want to go down the rabbit hole of details. For example, do we need to talk about the difference between Kubernetes being the platform and some other platform running in the Kubernetes platform?

Databases, web services, API servers, email servers, and many other things are considered application software. Is there anything on the OperatorHub today that wouldn't fall in that bucket?

- Matt

On Wed, Nov 6, 2019, at 1:39 PM, Gerred Dillon wrote:
That's part of the traditional definition, yes, so I'm all for it - but a lot of the definitions seemed to be moving more generically. Could an aggregate API, while not a CRD, be included in the definition of an operator? I have no opinions here but wanted to raise that issue.

On Wed, Nov 6, 2019 at 1:37 PM Erick Carty <erickcarty@...> wrote:
Second Devdatta's concise definition.

Please correct me if I misunderstood, but I thought Operators allowed accessing/managing CRDs through the API Server (i.e. using kubectl).

On Wed, Nov 6, 2019 at 10:34 AM Gerred Dillon <hello@...> wrote:
I like Devdatta's description a lot about domain + workflow + declarative, though I wouldn't call it an API extension, as that may potentially imply aggregate APIs. It's still a controller first and foremost, and while CRDs are nice, they aren't necessarily the only way to capture the data side of the domain.

On Wed, Nov 6, 2019 at 1:30 PM Devdatta Kulkarni <devdatta@...> wrote:
Hello,

I agree with Marc and Nicolas. The definition of an Operator should not be restricted to stateful systems or applications.

Here is another more generic definition option:

"An Operator is a Kubernetes API extension to automate domain-specific workflow actions through declarative definitions."

This allows the term Operator to be used in a wide variety of situations. Also, by not explicitly calling out 'controller' or 'resource', it avoids too much detail in a definition. At the same time, having 'Kubernetes API extension' in it alludes to the fact that an Operator actually extends Kubernetes's control plane.

Best regards,
Devdatta
 



From: cncf-sig-app-delivery@... <cncf-sig-app-delivery@...> on behalf of Marc Campbell via Lists.Cncf.Io <marc=replicated.com@...>
Sent: Wednesday, November 6, 2019 12:03 PM
To: nicolas.trangez@... <nicolas.trangez@...>
Cc: cncf-sig-app-delivery@... <cncf-sig-app-delivery@...>
Subject: Re: [cncf-sig-app-delivery] Operator Definition
 

The qualification of "Stateful application" feels restrictive to me also. We've built operators that don't manage any state, but still call them "operators" because they codify and package domain or application-specific knowledge.



-Marc


On Wed, Nov 06, 2019 at 9:56 AM, Nicolas Trangez <nicolas.trangez=scality.com@...> wrote:

Hello,

On Wed, 2019-11-06 at 12:46 -0500, Matt Farina wrote:

In the TOC meeting yesterday (11/5) there was a question about the definition of an operator that was kicked back to SIG App Delivery. In the app delivery call today, at least before I had to drop, no definition had been proposed but the topic was discussed. So, I would like to propose a definition.

An Operator is an application-specific controller that extends the Kubernetes API to create, configure, and manage instances of complex stateful applications on behalf of a Kubernetes user. It builds upon the basic Kubernetes resource and controller concepts but includes domain or application-specific knowledge to automate common tasks.


Why would this only be for 'stateful' applications, or even
'applications' in general?


Some cases I have in mind:


Non-stateful applications:
- An operator that deploys some application which itself doesn't have any stateful components, but can be configured to connect to some existing/external system for persistence where needed.
- An operator that deploys some application which itself *does* come with a stateful component, but delegates the job of managing this stateful component (say, a database) to *another* operator (using its CR(D)s) which in turn does whatever is needed.



Non-applications:
- Maybe a bit of a corner case which doesn't necessarily *have to* fall under the 'operator' definition: we have a system in place which, given CR instances, performs operations fully outside of Kubernetes at first (provisioning certain hardware), then inside Kubernetes (making this hardware available to the in-cluster workloads). Not strictly related to app-delivery, but if there's some definition of 'operator', it shouldn't only cover what app-delivery expects from them.


A similar case would be ClusterAPI implementations that launch VMs or whatnot.

This is taken directly from the original introduction to operators by CoreOS <https://coreos.com/blog/introducing-operators.html> when they documented the concept.

In general, I think the concept has grown beyond the initial scope/goal...

The Kubernetes documentation has a long description of operators <https://kubernetes.io/docs/concepts/extend-kubernetes/operator/> but I think the original definition is fairly clear and concise. For a definition I would suggest something short and to the point.

Agree.

Nicolas


Matt Farina
 

I would also point out that I believe it doesn’t need to be an extension of the Kube API (ala CRD), you could just deploy a single Pod/Deployment to do this management, no?

I think Doug brings up an excellent point.

Where I started from when I put the CoreOS definition out there assumed Kubernetes. But,  technically you could have a Nomad operator and maybe even a Cloud Foundry operator and those would look different in implementation. Not only do you not need to extend the Kubernetes API, Kubernetes isn't really a requirement in the picture even if all of us will use it.

Should we have both a definition of an operator and a common k8s operator?

On Wed, Nov 6, 2019, at 11:58 PM, Doug Davis wrote:
I don’t want Matt to feel alone :-) so I’ll agree with his concern about the latest proposed wording being a bit too... low-level/geeky for a not insignificant number of people who will read it.

“domain-specific workflow actions through declarative definitions”, while probably 100% accurate, is something I think too many people will have to read several times, very slowly, to fully grok. I think you can convey the same net result with slightly different wording, tweaked from Matt’s original text: “manage the creation and lifecycle of Kubernetes deployed applications”. Or something like that.

I would also point out that I believe it doesn’t need to be an extension of the Kube API (ala CRD), you could just deploy a single Pod/Deployment to do this management, no?

-Doug




On Nov 6, 2019, at 3:02 PM, Devdatta Kulkarni <devdatta@...> wrote:


The point about ‘whom’ the definition is targeted towards is valid.

As pointed out below, there is a wide variety of personas who need to refer to ‘Kubernetes Operators’. The key focus needs to be on those who are not from the Kubernetes world but represent the domains for which Kubernetes Operators are being written. For this reason, I believe the definition should not contain Kubernetes-specific terms like ‘controller’ or ‘resource’ that may be confusing to the broader community. Generally understood and commonly used terms are better.

This has been our experience when working with enterprises adopting Kubernetes. Terms such as CRD, Custom Resources, Custom Controllers, Resources, and Controllers are too Kubernetes-specific for them. They are lost when discussing Operators using these terms. Instead, what resonates are terms like REST APIs, API extension, workflows, etc.

In fact, based on the recurrent questions that we have received about Operators, we ended up creating an Operator FAQ:

https://github.com/cloud-ark/kubeplus/blob/master/Operator-FAQ.md



On the point of API Extension being confused with Aggregated APIs — it is valid.

To accommodate this, the alternative suggested definition can be tweaked to:

"An Operator is a Kubernetes REST API extension to automate domain-specific workflow actions through declarative definitions."


The declarative part is not specific to cloud native, to be honest. It has been around since Infrastructure-as-Code systems: ‘declarative’ captures ‘how’ the automation is specified and is generally well understood. Other alternatives to 'declarative definitions' can be 'declarative specifications' or 'a declarative model'.


Best regards,

Devdatta





From: cncf-sig-app-delivery@... <cncf-sig-app-delivery@...> on behalf of Matt Farina via Lists.Cncf.Io <matt=mattfarina.com@...>
Sent: Wednesday, November 6, 2019 12:54 PM
To: cncf-sig-app-delivery@... <cncf-sig-app-delivery@...>
Cc: cncf-sig-app-delivery@... <cncf-sig-app-delivery@...>
Subject: Re: [cncf-sig-app-delivery] Operator Definition
 
For a moment can we consider who will read this and need to understand it. I think the language that's chosen should be useful to them and not just us. Many of the people who need to understand it are not on this list and representative personas or roles for them may not be on this list.

I would suggest the definition needs to make sense for:
  • People who run applications in Kubernetes
  • Managers of those people running applications. This is so that they understand the work (some but not all of them are technical) and are able to use it to justify the work up the chain in larger organizations
  • Sales, support, and other tangential people. Imagine the case where a sales person at Redis is talking to a business manager at a company just coming on board to Kubernetes. Just coming on board means they don't deeply know many of the concepts the way we do. Redis is talking about the Redis Enterprise operator and needs to clearly communicate what operators are. And, if the people on the receiving end go to look it up that needs to jive.

I would personally like to see a definition that works for people who go far beyond those who are close to the tooling.

"An Operator is a Kubernetes API extension to automate domain-specific workflow actions through declarative definitions."

My fear, and maybe it's unfounded, is that a definition like that would lead to head nodding from many, but they may not really understand it. To be honest, I've learned that there are a lot of people who are just approaching cloud native who don't quite get what we mean by declarative until they have been immersed in cloud native for a while. This was one of the things that drew me to the original CoreOS definition.

Removing complex and stateful from that definition would read...

An Operator is an application-specific controller that extends the Kubernetes API to create, configure, and manage instances of applications on behalf of a Kubernetes user. It builds upon the basic Kubernetes resource and controller concepts but includes domain or application-specific knowledge to automate common tasks.

It doesn't get into the difference between application and system software. When it comes to system software I wouldn't want to go down the rabbit hole of details. For example, do we need to talk about the difference between Kubernetes being the platform and some other platform running in the Kubernetes platform?

Databases, web services, API servers, email servers, and many other things are considered application software. Is there anything on the OperatorHub today that wouldn't fall in that bucket?

- Matt

On Wed, Nov 6, 2019, at 1:39 PM, Gerred Dillon wrote:
That's part of the traditional definition, yes, so I'm all for it - but a lot of the definitions seemed to be moving more generically. Could an aggregate API, while not a CRD, be included in the definition of an operator? I have no opinions here but wanted to raise that issue.

On Wed, Nov 6, 2019 at 1:37 PM Erick Carty <erickcarty@...> wrote:
Second Devdatta's concise definition.

Please correct me if I misunderstood, but I thought Operators allowed accessing/managing CRDs through the API Server (i.e. using kubectl).

On Wed, Nov 6, 2019 at 10:34 AM Gerred Dillon <hello@...> wrote:
I like Devdatta's description a lot about domain + workflow + declarative - though I wouldn't call it an API extension, as that may potentially imply aggregate APIs. It's still a controller first and foremost, and while CRDs are nice, they aren't necessarily the only way to capture the data side of the domain.

On Wed, Nov 6, 2019 at 1:30 PM Devdatta Kulkarni <devdatta@...> wrote:
Hello,

I agree with Marc and Nicolas. The definition of an Operator should not be restricted to stateful systems or applications.

Here is another more generic definition option:

"An Operator is a Kubernetes API extension to automate domain-specific workflow actions through declarative definitions."

This allows the term Operator to be used in a wide variety of situations. Also, by not explicitly calling out 'controller' or 'resource', it avoids too much detail in a definition. At the same time, having 'Kubernetes API extension' in it alludes to the fact that an Operator actually extends Kubernetes's control plane.

Best regards,
Devdatta
 



From: cncf-sig-app-delivery@... <cncf-sig-app-delivery@...> on behalf of Marc Campbell via Lists.Cncf.Io <marc=replicated.com@...>
Sent: Wednesday, November 6, 2019 12:03 PM
To: nicolas.trangez@... <nicolas.trangez@...>
Cc: cncf-sig-app-delivery@... <cncf-sig-app-delivery@...>
Subject: Re: [cncf-sig-app-delivery] Operator Definition
 

The qualification of "Stateful application" feels restrictive to me also. We've built operators that don't manage any state, but still call them "operators" because they codify and package domain or application-specific knowledge.



-Marc


On Wed, Nov 06, 2019 at 9:56 AM, Nicolas Trangez <nicolas.trangez=scality.com@...> wrote:

Hello,

On Wed, 2019-11-06 at 12:46 -0500, Matt Farina wrote:

In the TOC meeting yesterday (11/5) there was a question about the definition of an operator that was kicked back to SIG App Delivery. In the app delivery call today, at least before I had to drop, no definition had been proposed but the topic was discussed. So, I would like to propose a definition.

An Operator is an application-specific controller that extends the Kubernetes API to create, configure, and manage instances of complex stateful applications on behalf of a Kubernetes user. It builds upon the basic Kubernetes resource and controller concepts but includes domain or application-specific knowledge to automate common tasks.


Why would this only be for 'stateful' applications, or even 'applications' in general?


Some cases I have in mind:


Non-stateful applications:
- An operator that deploys some application which itself doesn't have any stateful components, but can be configured to connect to some existing/external system for persistence where needed.
- An operator that deploys some application which itself *does* come with a stateful component, but delegates the job of managing this stateful component (say, a database) to *another* operator (using its CR(D)s) which in turn does whatever is needed.



Non-applications:
- Maybe a bit of a corner case which doesn't necessarily *have to* fall under the 'operator' definition: we have a system in place which, given CR instances, performs operations fully outside of Kubernetes at first (provisioning certain hardware), then inside Kubernetes (making this hardware available to the in-cluster workloads). Not strictly related to app-delivery, but if there's some definition of 'operator', it shouldn't only cover what app-delivery expects from them.


A similar case would be ClusterAPI implementations that launch VMs or whatnot.

This is taken directly from the original introduction to operators by CoreOS <https://coreos.com/blog/introducing-operators.html> when they documented the concept.

In general, I think the concept has grown beyond the initial scope/goal...

The Kubernetes documentation has a long description of operators <https://kubernetes.io/docs/concepts/extend-kubernetes/operator/> but I think the original definition is fairly clear and concise. For a definition I would suggest something short and to the point.

Agree.

Nicolas











Reitbauer, Alois
 

I also want to track these discussions in issues:

I created one for the operator definition 
https://github.com/cncf/sig-app-delivery/issues/15

and one for the Operator Framework/Hub submission 
https://github.com/cncf/sig-app-delivery/issues/14

// Alois

 

 

From: <cncf-sig-app-delivery@...> on behalf of Doug Davis <dug@...>
Date: Thursday, 7. November 2019 at 06:01
To: Devdatta Kulkarni <devdatta@...>
Cc: "Matt Farina via Lists.Cncf.Io" <matt=mattfarina.com@...>, "cncf-sig-app-delivery@..." <cncf-sig-app-delivery@...>
Subject: Re: [cncf-sig-app-delivery] Operator Definition

 

I don’t want Matt to feel alone :-) so I’ll agree with his concern about the latest proposed wording being a bit too... low-level/geeky for a not-insignificant number of people who will read it.

 

“domain-specific workflow actions through declarative definitions”, while probably 100% accurate, is something I think too many people will have to read several times, very slowly, to fully grok. I think you can convey the same net result with slightly different wording, tweaked from Matt’s original text: “manage the creation and lifecycle of Kubernetes deployed applications”. Or something like that.

 

I would also point out that I believe it doesn’t need to be an extension of the Kube API (à la CRD); you could just deploy a single Pod/Deployment to do this management, no?

 

-Doug

 



On Nov 6, 2019, at 3:02 PM, Devdatta Kulkarni <devdatta@...> wrote:

The point about ‘whom’ the definition is targeted towards is valid.

As pointed out below, there is a wide variety of personas who need to refer to ‘Kubernetes Operators’. The key focus needs to be on those who are not from the Kubernetes world but represent the domains for which Kubernetes Operators are being written. For this reason, I believe the definition should not contain Kubernetes-specific terms like ‘controller’ or ‘resource’ that may be confusing to the broader community. Generally understood and commonly used terms are better.


 

 

 

 

 

 

 
