Welcome new TOC members!
I didn't participate in some recent project graduation votes because I didn't feel I had adequate information to make a decision. In one case, due diligence that had been performed hadn't been documented or presented. In another, the content of the application (basically a checklist and a list of users) didn't seem sufficient, despite nominally meeting our criteria.
Our current criteria are here:
There is a proposal to add a security audit to the requirements, which is a good step:
But I think we need to start by revisiting what we want graduation to mean to users, and then make sure the criteria actually deliver those attributes. I should also add that, whatever criteria we come up with, we should ensure the CNCF helps projects meet them.
Our criteria imply that we want users to be able to use the projects in relatively critical (a term we should probably define) so-called "production" use cases. How should we ensure that is the case?
I've recently heard from a user that they didn't think most software in the ecosystem was usable in production due to lack of scalability, reliability, security, and other issues. I also heard from a security engineer that they wouldn't trust most open source due to lack of rigorous review processes, especially of dependencies. Within Kubernetes, we've found that CVEs don't appear to be tracked for Golang libraries. Does wide usage of a project suggest that these issues have been overcome? That's not clear to me, particularly since Kubernetes itself needs plenty of improvement.
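As a concrete illustration of what automated dependency CVE checking could look like, here is a minimal Go sketch that looks up known advisories for a single module version. It assumes the OSV.dev query API (https://api.osv.dev/v1/query); the module name and version are placeholders rather than a claim about any particular dependency.

// Minimal sketch: look up known advisories for one Go module version.
// Assumes the OSV.dev query API; module name/version are placeholders.
package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "log"
    "net/http"
)

type osvQuery struct {
    Version string `json:"version"`
    Package struct {
        Name      string `json:"name"`
        Ecosystem string `json:"ecosystem"`
    } `json:"package"`
}

type osvResponse struct {
    Vulns []struct {
        ID      string `json:"id"`
        Summary string `json:"summary"`
    } `json:"vulns"`
}

func main() {
    q := osvQuery{Version: "0.3.5"}
    q.Package.Name = "golang.org/x/text" // placeholder module
    q.Package.Ecosystem = "Go"

    body, err := json.Marshal(q)
    if err != nil {
        log.Fatal(err)
    }
    resp, err := http.Post("https://api.osv.dev/v1/query", "application/json", bytes.NewReader(body))
    if err != nil {
        log.Fatal(err)
    }
    defer resp.Body.Close()

    var out osvResponse
    if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
        log.Fatal(err)
    }
    if len(out.Vulns) == 0 {
        fmt.Println("no known advisories for this version")
        return
    }
    for _, v := range out.Vulns {
        fmt.Printf("%s: %s\n", v.ID, v.Summary)
    }
}

Running something like this across every entry in a project's dependency file would be one low-cost way to make "are dependency CVEs tracked?" an answerable due-diligence question.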
I've started to look at the more stringent CII criteria:
One possible approach is for us to require the gold standard, and then work with CII to ensure it covers some of the relevant criteria, or to define an even more rigorous "platinum" level.
We also might want a scalability standard. Is 100 nodes/instances/something sufficiently scalable? 1000?
I also assume we want users to value the CNCF graduated status. As it stands, it's hard for an external observer to tell whether we're acting as a rubber stamp or making well-informed decisions. Perhaps it's worth providing a rationale/justification statement rather than just "+1".
Thoughts?
I think “production ready” has a lot to do with known, accurate and well-documented limitations, as opposed to no limitations at all, or vague claims, or an absolute set of metrics that need to be met (e.g. around scalability) for all projects.
As a contrived concrete example, a project that reliably and demonstrably scales to, say, 100 nodes, and clearly publishes data supporting that fact, might be perfectly production-ready for a user that has no intention of ever exceeding that scale for their use of that project (and vastly more appealing than another project that makes vague and exaggerated claims about scalability, which turn out not to be true in practical use cases).
For that reason I like the CII model, which is more about clearly articulating what's there, and what's not, than it is about checking off a bunch of "must-have" checkboxes. Clearly there will be at least a few "must-have" checkboxes, but I think there will be vastly more "do we understand and clearly document this limitation" type items. And then there is the overall question of whether, given the known limitations, a project is useful for a sufficiently significant set of production use cases.
Until now, we have tended to use the number and size of claimed or actual production use cases as an approximation of the answer to the aforementioned question.
Q
I would be wary of requiring quantification of scalability or performance as a graduation requirement, mainly because in my experience it's virtually impossible to do so accurately given the number of different configurations and deployment targets (on-prem, IaaS, different hardware types, etc.) that end users end up deploying the software to.
I would much rather see us get crisper about the underlying attributes that we care about and be clearer about the requirements (not leave so many things up to the "judgement" of the TOC). These are attributes like security auditing and CVE procedures, CI, test coverage percentage and type of test coverage (unit tests, integration tests, fuzz tests, etc.), code review procedures, stability of the master branch, PR review/merge throughput, number of end users, scale of end users, number of full-time maintainers, number of maintainer orgs, governance and dispute-resolution model, etc.
I have lots of opinions on how I would personally quantify some of the things I listed above as a graduation requirement, but it's probably more prudent to start with a list of things that we care to quantify.
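To make "a list of things that we care to quantify" concrete, here is a purely hypothetical sketch (not an agreed format) of how attributes like the ones above could be recorded as structured, comparable data in a due-diligence report. All field names and example values are illustrative only.

// Hypothetical sketch of a structured due-diligence record; field names
// and example values are illustrative only, not an agreed CNCF format.
package main

import "fmt"

// DueDiligenceReport captures quantifiable project-health attributes.
type DueDiligenceReport struct {
    Project                string
    SecurityAuditPublished bool    // third-party audit report available?
    CVEProcessDocumented   bool    // documented vulnerability/embargo process?
    CIOnEveryPR            bool
    UnitTestCoveragePct    float64 // as reported by the project's CI
    IntegrationTests       bool
    FuzzTests              bool
    FullTimeMaintainers    int
    MaintainerOrgs         int // distinct employers among maintainers
    ProductionEndUsers     int // publicly referenceable adopters
    GovernanceDocURL       string
}

func main() {
    r := DueDiligenceReport{
        Project:                "example-project", // placeholder
        SecurityAuditPublished: true,
        CVEProcessDocumented:   true,
        CIOnEveryPR:            true,
        UnitTestCoveragePct:    72.5,
        IntegrationTests:       true,
        FuzzTests:              false,
        FullTimeMaintainers:    6,
        MaintainerOrgs:         4,
        ProductionEndUsers:     25,
        GovernanceDocURL:       "https://example.com/governance", // placeholder
    }
    fmt.Printf("%+v\n", r)
}

Even without agreed thresholds, filling in a record like this for every graduation candidate would make the TOC's judgement calls visible and comparable across projects.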
Matt, most of the things you mention are fairly well covered by the CII Best Practices program, a basic version of which is currently a CNCF graduation requirement, as Brian detailed earlier in the thread.
I would encourage you all to take a walk through a CII Best Practices declaration if you haven't already done so – it's reasonably good and comprehensive, IMO. I was pleasantly surprised.
As for not requiring quantification of performance or scalability as a graduation criterion, I hear your concern, but I think even a limited form of that (e.g. "publish repeatable performance and scalability results for one or more representative configurations and deployment targets") would be vastly better than the nothing that we have now.
Q
> I would encourage you all to take a walk through a CII Best Practices declaration if you haven't already done so – it's reasonably good and comprehensive, IMO. I was pleasantly surprised.
I agree it's a great start. Personally I think we should come up with a custom variant that tweaks some of the values and focuses on some key common elements.
> As for not requiring quantification of performance or scalability as a graduation criterion, I hear your concern, but I think even a limited form of that (e.g. "publish repeatable performance and scalability results for one or more representative configurations and deployment targets") would be vastly better than the nothing that we have now.
I think if we want to go down this road (which I still don't recommend), it would require CNCF funding for contractors to set up a repeatable performance-profiling framework, hook it up to CI, etc. This is an extremely non-trivial undertaking that can take months of people-effort to do correctly.
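For what it's worth, the per-project piece of that does not have to be huge. Here is a minimal sketch of a repeatable micro-benchmark that could run in CI; the JSON-encoding workload is just a stand-in for a project's real critical path, and the surrounding framework (fixed runners, result storage, regression comparison) is the part that would need the funded effort described above.

// Minimal sketch of a repeatable micro-benchmark runnable in CI.
// The encoded struct is a stand-in for a project's real critical path.
package bench

import (
    "encoding/json"
    "testing"
)

type record struct {
    Name   string   `json:"name"`
    Labels []string `json:"labels"`
    Value  float64  `json:"value"`
}

func BenchmarkEncodeRecord(b *testing.B) {
    r := record{Name: "example", Labels: []string{"a", "b", "c"}, Value: 42.0}
    b.ReportAllocs()
    for i := 0; i < b.N; i++ {
        if _, err := json.Marshal(&r); err != nil {
            b.Fatal(err)
        }
    }
}

Run on a fixed runner with something like "go test -bench=. -count=10" and compare runs across commits (for example with benchstat) to catch regressions; as noted, the hard part is keeping the environment and workload representative and stable, not writing the benchmark itself.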
> I would encourage you all to take a walk through a CII Best Practices declaration if you haven't already done so – it's reasonably good and comprehensive, IMO. I was pleasantly surprised.
> I agree it's a great start. Personally I think we should come up with a custom variant that tweaks some of the values and focuses on some key common elements.
I recruited David Wheeler to create the Best Practices Badge for CII and worked with him to create both the passing and higher milestone levels and build the BadgeApp. David is open to making adjustments to the criteria, although there are obviously issues around backward compatibility for existing badge holders.
We're waiting for David to confirm that he can present at the 2/19 TOC meeting if you'd like to hear from him. We've also arranged for him to attend OSLS, and I would encourage you to meet with him there.