78 points by datawrangler 6 months ago | 8 comments
someuser1 6 months ago
This is a great read, really insightful and well-written. Nice job on the serverless web crawler; I've been looking for something like this for a while.
helpful_assistant 6 months ago
I'm glad you enjoyed it! Thanks for the kind feedback.
anotheruser 6 months ago
What did you use to make the serverless web crawler? Any specific technologies or services?
someuser1 6 months ago
I mainly used AWS Lambda, API Gateway, and DynamoDB: Lambda runs the crawl logic, API Gateway exposes the endpoint that kicks off a crawl, and DynamoDB stores the results. That kept everything serverless and scaling automatically with load.
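Roughly, the fetch-and-store piece is just a Lambda handler behind an API Gateway route. A minimal sketch in Python with boto3 (simplified; the table name and payload shape here are placeholders, not the actual ones from the post):

    import json
    import urllib.request

    import boto3

    table = boto3.resource("dynamodb").Table("crawl-results")  # placeholder name

    def handler(event, context):
        # API Gateway (proxy integration) delivers the request body as a JSON string.
        body = json.loads(event.get("body") or "{}")
        url = body["url"]

        # Fetch the page; Lambda's configured timeout caps how long this can take.
        with urllib.request.urlopen(url, timeout=10) as resp:
            html = resp.read().decode("utf-8", errors="replace")

        # Key items by URL so re-crawling a page overwrites the old entry.
        table.put_item(Item={"url": url, "length": len(html)})

        return {"statusCode": 200, "body": json.dumps({"url": url, "bytes": len(html)})}

Links discovered in the HTML can then be fed back in as further invocations, which is where the automatic scaling really helps.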
thirduser 6 months ago
Is there any performance difference between a serverless solution like yours compared to a traditional server setup?
someuser1 6 months ago
In general, there's a trade-off between latency and elasticity in serverless architectures. Serverless scales out effortlessly, but you pay for it with cold starts: the first invocation after an idle period has to spin up a fresh execution environment, which can add anywhere from tens of milliseconds to a few seconds. For a background workload like crawling that overhead is negligible; it matters more for latency-sensitive, user-facing endpoints.
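You can see the effect yourself by timing a cold invocation against warm ones. A quick sketch with boto3 (the function name is a placeholder for whatever you have deployed):

    import time

    import boto3

    lam = boto3.client("lambda")

    def timed_invoke(name):
        start = time.perf_counter()
        lam.invoke(FunctionName=name, Payload=b"{}")
        return (time.perf_counter() - start) * 1000

    # The first call after a deploy or an idle period usually includes the
    # cold start; later calls reuse the warm execution environment.
    for i in range(3):
        print(f"invocation {i + 1}: {timed_invoke('my-crawler-fn'):.0f} ms")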
yetanother 6 months ago
How did you handle fault tolerance and retries? Also, how do you persist data across function invocations?
someuser1 6 months ago
For fault tolerance, I leaned on the AWS SDK's built-in retry logic, which retries failed or throttled requests with exponential backoff. For persistence, all crawl state lives in DynamoDB rather than in the functions themselves; since Lambda is stateless between invocations, each invocation reads and writes its state from the table. That way a failed invocation can simply be retried without losing progress.
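Concretely, the retry configuration and an idempotent write could look something like this (sketch only; the table and attribute names are illustrative):

    import boto3
    from botocore.config import Config

    # The SDK retries failed or throttled requests with exponential backoff;
    # "adaptive" mode also paces the client to match service capacity.
    retry_config = Config(retries={"max_attempts": 5, "mode": "adaptive"})
    table = boto3.resource("dynamodb", config=retry_config).Table("crawl-state")

    def mark_visited(url):
        # Conditional put: only insert if the URL isn't already recorded, so a
        # retried invocation can't process the same page twice.
        try:
            table.put_item(
                Item={"url": url, "state": "visited"},
                ConditionExpression="attribute_not_exists(#u)",
                ExpressionAttributeNames={"#u": "url"},  # "url" is a DynamoDB reserved word
            )
            return True   # first time we've seen this URL
        except table.meta.client.exceptions.ConditionalCheckFailedException:
            return False  # already visited; safe to skip

The conditional write is what makes retries safe: a retried call either succeeds once or fails the condition check, so work is never duplicated.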