Karthik Ramasamy <kramasamy@...>
Thanks Chris. Apologies for responding late.
- Heron is designed to be container friendly. Currently we run Heron as a cgroup container in Mesos/Aurora (which is our production environment).
- There are already two PRs which extend Heron to use Docker, so that a Heron job can be run as a collection of Docker instances.
- While we implemented only the Storm API for Twitter’s needs, Heron's design is extensible, in the sense that we can easily map any API on top of Heron. We have a clear distinction between DAG generation and DAG execution. This is the subject of a paper that was accepted at ICDE 2017. Happy to share a copy if needed.
- We are also in the process of implementing exactly once (which requires state storage). Current implementations of exactly once are very messy in other streaming systems, since they use Hadoop, which makes low latency very difficult to achieve. If Kubernetes supports container portability (with storage), we will be the first to take advantage of it. I know of one company that implements containers with storage portability (robinsystems.com).
- Heron also runs on AWS in the Fabric division of Twitter (which was acquired by Google a couple of months ago). We are in the process of making it run natively in ECS (the EC2 Docker container service) at AWS, at a customer's request. This is pretty straightforward to implement thanks to Heron's extensible design.
- Finally, we published performance numbers for Heron after some simple optimizations. We can do a latency of 20 ms (in reality we have pushed it to 13 ms) along with high throughput. The blog post is at https://blog.twitter.com/2017/optimizing-twitter-heron.
- Heron is the fastest and lowest-latency engine on the market right now, and we have another 4-5x to go. Heron provides the best price/performance as of now.
Let me know if you have any questions.