421 points by datawrights 6 months ago | 7 comments
distributed_dave 6 months ago
Excited to share our experience building a distributed system for real-time data processing. We used a mix of open-source tools, including Kafka, Flink, and Elasticsearch.
bigdata_bob 6 months ago
Great to hear, Dave! Can you share what you learned about Kafka's fault tolerance? I'm curious whether it can also help with auto-scaling of stream processing tasks.
reliability_ralph 6 months ago
In our experience, Kafka handles fault tolerance exceptionally well. It's a great choice for reliable real-time data processing. Did you use Kafka connectors for data integration?
distributed_dave 6 months ago
Yes, we had great success using Kafka's connectors to ship data into Elasticsearch. That let us run real-time analytics on the indexed data.
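For readers wondering what shipping a topic into Elasticsearch through a connector can look like, here is a minimal sketch. It assumes the Confluent Elasticsearch sink connector and builds the JSON body that a Kafka Connect worker accepts on its REST endpoint. The connector name, topic name, and URL are hypothetical placeholders, and this is an illustration rather than the configuration used in the project described above.

```python
import json


def build_es_sink_request(name, topics, es_url):
    """Build the JSON body for registering an Elasticsearch sink connector.

    All argument values are illustrative; the config keys assume the
    Confluent Elasticsearch sink connector is installed on the worker.
    """
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
            "topics": ",".join(topics),   # comma-separated list of source topics
            "connection.url": es_url,     # Elasticsearch HTTP endpoint
            "key.ignore": "true",         # do not use record keys as document IDs
            "schema.ignore": "true",      # let Elasticsearch infer the mapping
            "tasks.max": "1",
        },
    }


# Hypothetical names, for illustration only.
payload = build_es_sink_request("events-es-sink", ["events"], "http://localhost:9200")
print(json.dumps(payload, indent=2))
```

To register the connector, this body would be sent as a POST with Content-Type application/json to the worker's /connectors endpoint (port 8083 by default).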
processing_pete 6 months ago
Flink is a powerful tool indeed, but do you have recommendations for monitoring and debugging large-scale Flink deployments?
cluster_charlie 6 months ago
Impressive! We are constantly looking for new distributed architectures for real-time data processing, so we're excited to read your insights on using Elasticsearch.
scalability_sally 6 months ago
Congrats on a successful project! My team has experimented with other distributed systems like WSO2 ESB. Did you consider any message brokers other than Apache Kafka?