45 points by scrapingmaster 6 months ago | 10 comments
user1 6 months ago
@author Show HN is awesome! I've never thought about using Django for web scraping. Can't wait to try it out!
author 6 months ago
@user1 Thanks! I'm glad you like it. Let me know if you need any help getting started.
user2 6 months ago
How does this compare to using something like Scrapy?
author 6 months ago
@user2 Scrapy is a framework built specifically for crawling and scraping, so it handles things like request scheduling and retries out of the box. Django doesn't fetch pages itself, but its built-in pieces (the ORM, management commands, the admin) are useful around the scraping code. I personally prefer working with Django because I find it more versatile and better suited to general web development.
user3 6 months ago
Did you use any specific database or queueing system for managing the scraped data?
author 6 months ago
@user3 For this example, I didn't need a separate database or queueing system, since the scraped data is printed in real time. However, you could easily hook this up to a database or message queue to persist the data or process it differently.
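For anyone wondering what "hook this up to a database" might look like, here is a minimal sketch. It is an assumption about one possible setup, not the project's actual code. It uses SQLite through the standard-library `sqlite3` module rather than the Django ORM so that it runs on its own, and the table name `scraped_item` and the helper names are hypothetical.

```python
import sqlite3


def init_db(conn):
    # Hypothetical schema: one row per scraped URL.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS scraped_item ("
        "url TEXT PRIMARY KEY, title TEXT NOT NULL)"
    )


def save_items(conn, items):
    # INSERT OR REPLACE keeps a single row per URL, so scraping the
    # same page again updates the row instead of duplicating it.
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO scraped_item (url, title) VALUES (?, ?)",
            items,
        )


def count_items(conn):
    return conn.execute("SELECT COUNT(*) FROM scraped_item").fetchone()[0]


def get_title(conn, url):
    row = conn.execute(
        "SELECT title FROM scraped_item WHERE url = ?", (url,)
    ).fetchone()
    return row[0] if row else None


# In-memory database for the demo; a file path would persist to disk.
conn = sqlite3.connect(":memory:")
init_db(conn)
save_items(conn, [("https://example.com/a", "A"), ("https://example.com/b", "B")])
save_items(conn, [("https://example.com/a", "A (updated)")])
```

In a Django project the same idea would more likely be a model plus `update_or_create`, but the shape is the same: key each item on its URL so repeated runs don't pile up duplicates.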
user4 6 months ago
Are there any performance optimizations that you considered or implemented to handle high volumes of data?
author 6 months ago
@user4 Definitely! This is an important consideration when dealing with large amounts of data. Some common optimizations for this kind of application are storing results in a separate database or message queue, moving the actual scraping into a worker process, and caching responses to reduce the number of requests sent to the target website.
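A rough sketch of two of those ideas together (a worker pulling jobs from a queue, plus a response cache), using only the standard library. This is an illustration under stated assumptions, not what the project does: it uses a worker thread rather than a separate process, `fake_fetch` stands in for a real HTTP call so nothing hits the network, and all names here are hypothetical.

```python
import queue
import threading


class CachingFetcher:
    """Wraps a fetch function and remembers responses by URL."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._cache = {}
        self.requests_sent = 0  # how many times the real fetch ran

    def get(self, url):
        # Only a single worker thread calls this in the sketch, so no
        # lock is used; add one before running several workers.
        if url not in self._cache:
            self._cache[url] = self._fetch(url)
            self.requests_sent += 1
        return self._cache[url]


def worker(jobs, fetcher, results):
    while True:
        url = jobs.get()
        if url is None:  # sentinel: no more work
            break
        body = fetcher.get(url)
        results.append((url, len(body)))


def run(urls, fetch):
    jobs = queue.Queue()
    fetcher = CachingFetcher(fetch)
    results = []
    thread = threading.Thread(target=worker, args=(jobs, fetcher, results))
    thread.start()
    for url in urls:
        jobs.put(url)
    jobs.put(None)  # tell the worker to stop
    thread.join()
    return results, fetcher.requests_sent


def fake_fetch(url):
    # Placeholder for a real HTTP request.
    return "<html>" + url + "</html>"


results, requests_sent = run(
    ["https://example.com/a", "https://example.com/b", "https://example.com/a"],
    fake_fetch,
)
```

The repeated URL is served from the cache, so the target site sees two requests for three jobs. For real workloads you would probably want a task queue and a shared cache instead of an in-process dict.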
user5 6 months ago
Thanks for sharing this! Have you considered publishing a tutorial or series of blog posts explaining how you implemented this?
author 6 months ago
@user5 I have thought about it, and I might do that in the future. In the meantime, feel free to reach out if you have any questions about implementing this yourself.