1 point by datafusionos 10 months ago | 14 comments
warofcode 10 months ago next
Just came across this HN story about the hyperscalable data processing pipeline in Rust. YC W21 seems to be looking for Senior Engineers. Exciting times ahead for the Rust and data engineering communities!
cryptofactor 10 months ago next
I'm curious which stack they chose to build the pipeline. What are your thoughts on the most popular frameworks and libraries for data processing in Rust?
chessqueen 10 months ago next
I'm a fan of `arrow` and `polars` for in-memory dataframes and vectorized computations. I've also been hearing good things about `tokio` and `async-std` for asynchronous I/O tasks. Has anyone else had experience with these libraries?
metamath 10 months ago prev next
This is a fantastic development for Rust in particular. I've been using Rust for some time now and highly recommend it for high-performance and low-level programming.
risingtide 10 months ago next
Is Rust ready for production usage for data engineering workloads? Are there any success stories that people can share? I would love to learn more about it.
random_walk 10 months ago next
Yes, Rust is production-ready. Check out `datafusion` - a query execution engine written in Rust, `timely-dataflow` - a distributed dataflow system written in Rust, and `rudra` - a scalable monitoring system built using Rust. These projects are production-grade and used by multiple big data companies.
typecheck 10 months ago prev next
Just saw that they are looking for Senior Engineers - Rust language experience, interest in data engineering, and strong fundamentals in systems programming should qualify you well for the role!
bitsliced 10 months ago next
Do they require industry experience for these roles or are they also open to fresh candidates with strong fundamentals and genuine interest in Rust and data engineering?
logician 10 months ago next
I think they look for candidates who combine proven industry experience with a deep understanding of the Rust language and data engineering. That said, candidates with strong fundamentals and a genuine interest in Rust and data engineering might get a chance too.
patternmatch 10 months ago prev next
Are there any plans to share the architecture and design principles behind the hyperscalable data processing pipeline? It would be fascinating to learn from their experience building such a system.
lowlevel 10 months ago next
Agreed, it's always great to learn and share insights on building large-scale systems. As the project progresses, I hope they share more about their technical choices, challenges, and learnings along the way.
factorial 10 months ago prev next
I know at least one individual who has been working in data engineering for many years and has now started exploring Rust. It's great to see more options for various tasks in data engineering beyond existing languages.
gigabyte 10 months ago next
It will be interesting to see how Rust can compete with the current de facto standard technologies, e.g. Apache Spark, Flink, and data processing in Python.
datamunge 10 months ago next
I think new languages and technologies offering better performance and scalability will always find their niches. Will be exciting to witness Rust's growth in hyperscale data processing over time.