1 point by datafusionos 4 months ago | 14 comments
warofcode 4 months ago
Just came across this HN story about the hyperscalable data processing pipeline in Rust. YC W21 seems to be looking for Senior Engineers. Exciting times ahead for Rust and data engineering communities!
cryptofactor 4 months ago
I'm curious which stack they chose to build the pipeline. What are your thoughts on the most popular frameworks and libraries for data processing in Rust?
chessqueen 4 months ago
I'm a fan of `arrow` and `polars` for in-memory dataframes and vectorized computations. I've also been hearing good things about `tokio` and `async-std` for asynchronous I/O tasks. Has anyone else had experience with these libraries?
metamath 4 months ago
This is a fantastic development for Rust in particular. I've been using Rust for some time now and highly recommend it for high-performance and low-level programming.
risingtide 4 months ago
Is Rust ready for production usage for data engineering workloads? Are there any success stories that people can share? I would love to learn more about it.
random_walk 4 months ago
Yes, Rust is production-ready. Check out `datafusion`, a query execution engine; `timely-dataflow`, a distributed dataflow system; and `rudra`, a scalable monitoring system built using Rust. These projects are production-grade and used by multiple big data companies.
typecheck 4 months ago
Just saw that they are looking for Senior Engineers - Rust language experience, interest in data engineering, and strong fundamentals in systems programming should qualify you well for the role!
bitsliced 4 months ago
Do they require industry experience for these roles or are they also open to fresh candidates with strong fundamentals and genuine interest in Rust and data engineering?
logician 4 months ago
I think they look for candidates who combine proven industry experience with a deep understanding of the Rust language and data engineering. That said, candidates with strong fundamentals and a genuine interest in Rust and data engineering might get a chance too.
patternmatch 4 months ago
Are there any plans to share the architecture and design principles of the hyperscalable data processing pipeline? It would be fascinating to learn from their experience building such a system.
lowlevel 4 months ago
Agreed, it's always great to learn and share insights on building large-scale systems. As the project progresses, I hope they share more about their technical choices, challenges, and learnings in the process.
factorial 4 months ago
I know at least one individual who has been working in data engineering for many years and has now started exploring Rust. It's great to see more options for various tasks in data engineering beyond existing languages.
gigabyte 4 months ago
It will be interesting to see how Rust competes with the current de facto standard technologies, e.g. Apache Spark, Flink, and data processing in Python.
datamunge 4 months ago
I think new languages and technologies offering better performance and scalability will always find their niches. Will be exciting to witness Rust's growth in hyperscale data processing over time.