67 points by data_miner 5 months ago flag hide 16 comments
unixchamp 5 months ago next
Impressive! How well does your scraper handle dynamic websites and CAPTCHAs? #webscraping
cowboycoder 5 months ago next
Thanks for asking! My scraper can handle dynamic websites and it's integrated with an external CAPTCHA solving service, so it should work well in most cases. #discussion
jane99 5 months ago next
Wonderful job! Have you considered open-sourcing your CAPTCHA solving solution and integrating it with your scraper? #opensource
cowboycoder 5 months ago next
This is actually something I've been thinking about, but I haven't gotten around to it yet. Thanks for the suggestion! #opensource #feedback
optimizertim 5 months ago prev next
Would it be possible to share a demo or live example of your scraper in action? #showhn
johnlimiting 5 months ago prev next
Great work! Real-time job postings are always in demand. What tools did you use to build this? #showhn
cowboycoder 5 months ago next
I mainly used Scrapy and Redis. Scrapy is a powerful open-source web scraping framework, and Redis was used for real-time data storage and handling. #webscraping
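For readers who have not seen Scrapy and Redis paired before, here is a minimal sketch of one way they can fit together. It is not the project's actual code. It assumes a Scrapy-style item pipeline (a class exposing `open_spider` and `process_item`), items that carry a `url` field, and the redis-py client; the class name `RedisJobPipeline` and the keys `jobs:seen` and `jobs:new` are hypothetical.

```python
import hashlib
import json


class RedisJobPipeline:
    """Scrapy-style item pipeline: skip postings already seen, queue new ones in Redis."""

    SEEN_KEY = "jobs:seen"   # hypothetical Redis set of posting fingerprints
    QUEUE_KEY = "jobs:new"   # hypothetical Redis list of newly seen postings

    def __init__(self, client=None):
        # A client can be injected (handy for tests); otherwise open_spider creates one.
        self.client = client

    def open_spider(self, spider):
        if self.client is None:
            import redis  # third-party package: pip install redis
            # Assumes a Redis server on the default local port.
            self.client = redis.Redis(host="localhost", port=6379)

    def process_item(self, item, spider):
        posting = dict(item)
        # Fingerprint the posting by its URL (assumed to be unique per posting).
        fingerprint = hashlib.sha1(posting["url"].encode("utf-8")).hexdigest()
        # SADD returns 1 when the member is new and 0 when it already exists.
        if self.client.sadd(self.SEEN_KEY, fingerprint):
            self.client.lpush(self.QUEUE_KEY, json.dumps(posting, sort_keys=True))
        return item
```

In a real Scrapy project a class like this would be enabled through the `ITEM_PIPELINES` setting, and a separate consumer could pop entries from `jobs:new` to serve fresh postings.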
pythonscholar 5 months ago next
Really cool! Did you use Scrapy's built-in extensions to handle real-time posting or did you create your own solution? #performance
webscraperbeginner 5 months ago next
What do you recommend for someone new to web scraping? Is Scrapy overkill for someone who just wants to practice? #webscraping
jsfantastic 5 months ago prev next
Scrapy's amazing! Did you write your spiders in Python or another language? #webscraping
scriptkiddy 5 months ago prev next
Web scraping is cool, but have you ever considered using a headless browser? They're more resource-intensive but give you a full browsing experience. #discussion
cowboycoder 5 months ago next
Definitely! For this project, I prefer the speed and lower resource usage of plain Python HTTP requests over a headless browser. But for some projects, headless browsing might be a better fit. #discussion
devmaster 5 months ago prev next
How often does your scraper update postings? It could be a cool feature to have postings on demand. #showhn
sharksupport 5 months ago prev next
I've been looking into building a web scraper for a while now. I might just take a look at Scrapy and give this a try :) #learning
syntheticsun 5 months ago prev next
Any plans to add support for content other than real-time job postings, like tech news? #showhn
cowboycoder 5 months ago next
I'm always working on improving my scraper and making it more versatile, so this is definitely a possibility! #showhn #feedback