1 point by datafusionos 1 year ago | 14 comments
warofcode 1 year ago
Just came across this HN story about the hyperscalable data processing pipeline in Rust. The company (YC W21) seems to be looking for Senior Engineers. Exciting times ahead for the Rust and data engineering communities!
cryptofactor 1 year ago
I'm curious which stack they chose to build the pipeline. What are your thoughts on the most popular frameworks and libraries for data processing in Rust?
chessqueen 1 year ago
I'm a fan of `arrow` and `polars` for in-memory dataframes and vectorized computations. I've also been hearing good things about `tokio` and `async-std` for asynchronous I/O. Has anyone else had experience with these libraries?
metamath 1 year ago
This is a fantastic development for Rust in particular. I've been using Rust for some time now and highly recommend it for high-performance and low-level programming.
risingtide 1 year ago
Is Rust ready for production usage for data engineering workloads? Are there any success stories that people can share? I would love to learn more about it.
random_walk 1 year ago
Yes, Rust is production-ready. Check out `datafusion`, a query execution engine written in Rust, and `timely-dataflow`, a distributed dataflow system written in Rust. Both are used in production: for example, InfluxDB builds on DataFusion and Materialize builds on timely dataflow. I've also seen `rudra` mentioned as a monitoring system, but the Rudra project I know of is a static analyzer for memory-safety bugs in Rust, so double-check that one.
typecheck 1 year ago
Just saw that they are looking for Senior Engineers. Rust experience, an interest in data engineering, and strong fundamentals in systems programming should qualify you well for the role!
bitsliced 1 year ago
Do they require industry experience for these roles, or are they also open to new graduates with strong fundamentals and a genuine interest in Rust and data engineering?
logician 1 year ago
I think they are looking for candidates who combine proven industry experience with a deep understanding of the Rust language and data engineering. That said, candidates with strong fundamentals and a genuine interest in Rust and data engineering might get a chance too.
patternmatch 1 year ago
Are there any plans to share the architecture and design principles of the hyperscalable data processing pipeline? It would be fascinating to learn from their experience building such a system.
lowlevel 1 year ago
Agreed, it's always great to learn and share insights on building large scale systems. As the project progresses, I hope they share more about their technical choices, challenges, and learnings in the process.
factorial 1 year ago
I know at least one individual who has been working in data engineering for many years and has now started exploring Rust. It's great to see more options for various tasks in data engineering beyond existing languages.
gigabyte 1 year ago
It will be interesting to see how Rust competes with the current de facto standard technologies, e.g. Apache Spark, Apache Flink, and data processing in Python.
datamunge 1 year ago
I think new languages and technologies offering better performance and scalability will always find their niches. Will be exciting to witness Rust's growth in hyperscale data processing over time.