84 points by rust_scraper 7 months ago 14 comments
john_doe 7 months ago
Nice work! I've been looking for a Rust-based scraper. Do you have any plans for adding support for real-time updates? This could be pretty useful for tracking new threads.
author 7 months ago
@john_doe: That's a great idea, but I don't currently have plans for real-time support. I'll definitely consider it for the future. Thanks for the feedback!
another_user 7 months ago
This is pretty cool! Do you think you could add support for scraping user profiles/posts in the future?
author 7 months ago
@another_user: Sure thing! I'll see what I can do. It's in my backlog for now, but I'll try to make it a priority.
designerly 7 months ago
Your code is well-structured and easy to read. I noticed that you're using a vector for caching. What kind of performance improvements have you seen as a result?
author 7 months ago
@designerly: Thanks for the compliment and the question! Caching has helped noticeably: it cut the number of API calls to HN, and overall performance improved by 25-35%. It also reduces resource usage and lets me scrape more data efficiently.
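For readers wondering what a vector-backed cache like this could look like, here is a minimal sketch. It is not the project's actual code: `ItemCache`, `CachedItem`, and `get_or_fetch` are hypothetical names, the `fetch` closure stands in for a real HN API call, and the oldest-first eviction policy is an assumption.

```rust
/// Hypothetical cached entry: an HN item id plus its raw response body.
#[derive(Debug, Clone, PartialEq)]
pub struct CachedItem {
    pub id: u64,
    pub body: String,
}

/// A small Vec-backed cache with a fixed capacity.
/// When full, the oldest entry is evicted first (an assumed policy).
pub struct ItemCache {
    items: Vec<CachedItem>,
    capacity: usize,
}

impl ItemCache {
    /// Create a cache holding at most `capacity` items (minimum 1).
    pub fn new(capacity: usize) -> Self {
        ItemCache {
            items: Vec::new(),
            capacity: capacity.max(1),
        }
    }

    /// Look up an item by id with a linear scan.
    /// This is fine for small caches; a HashMap would scale better.
    pub fn get(&self, id: u64) -> Option<&CachedItem> {
        self.items.iter().find(|item| item.id == id)
    }

    /// Return the cached item if present; otherwise call `fetch`
    /// (a stand-in for an API request) and store the result.
    pub fn get_or_fetch<F>(&mut self, id: u64, fetch: F) -> &CachedItem
    where
        F: FnOnce(u64) -> String,
    {
        if let Some(pos) = self.items.iter().position(|item| item.id == id) {
            return &self.items[pos];
        }
        // Cache miss: make room if needed, then fetch and store.
        if self.items.len() >= self.capacity {
            self.items.remove(0);
        }
        self.items.push(CachedItem { id, body: fetch(id) });
        self.items.last().expect("an item was just pushed")
    }
}
```

Each cache hit skips one call to `fetch`, which is where the saved API calls would come from in a setup like this.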
rustacean_1 7 months ago
Do you use multiple threads when scraping? I'm curious whether that improves performance.
author 7 months ago
@rustacean_1: I actually don't use multiple threads. I used to, but the performance gains were minimal, given the nature of HTTP requests and HN's API rate limits. That said, I'm open to recommendations and curious whether others have had better results.
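To illustrate why extra threads may add little once requests are rate limited, here is a minimal sketch of a single-threaded throttle. It is an illustration under assumptions, not the project's code: the `Throttle` type, its method names, and the idea of a fixed minimum interval between requests are all hypothetical, and the actual limits are not stated in this thread.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical throttle that enforces a minimum gap between requests.
pub struct Throttle {
    min_interval: Duration,
    last_request: Option<Instant>,
}

impl Throttle {
    pub fn new(min_interval: Duration) -> Self {
        Throttle {
            min_interval,
            last_request: None,
        }
    }

    /// How long a caller would have to wait at time `now`
    /// before the next request is allowed.
    pub fn wait_needed(&self, now: Instant) -> Duration {
        match self.last_request {
            Some(last) => self
                .min_interval
                .saturating_sub(now.saturating_duration_since(last)),
            None => Duration::ZERO,
        }
    }

    /// Record that a request was sent at time `at`.
    pub fn record(&mut self, at: Instant) {
        self.last_request = Some(at);
    }

    /// Block until the next request is allowed, then record it.
    pub fn acquire(&mut self) {
        let wait = self.wait_needed(Instant::now());
        if !wait.is_zero() {
            thread::sleep(wait);
        }
        self.record(Instant::now());
    }
}
```

Under this model, N requests take at least (N - 1) times `min_interval` no matter how many threads issue them, so the limiter sets the throughput ceiling and the thread count does not.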
programming_fan 7 months ago
I'm interested in contributing to the project. What can I do to help?
author 7 months ago
@programming_fan: That's awesome! I welcome any help I can get. Current tasks on my todo list include improving error handling, adding more configuration options, and supporting webhook-based notifications. I can mention you in the project's README as a contributor. Check out the project on my GitHub account.