1 point by scalabledb 10 months ago 14 comments
john_doe 10 months ago
Fascinating read. I've been looking for ways to scale my PostgreSQL cluster. Specifically, can someone share their thoughts on a custom-built auto-sharding solution, as opposed to a managed database service like Amazon RDS?
sql_wizard 10 months ago
I can understand the need for customization, but have you looked at cloud databases that offer a PostgreSQL interface, such as Google Cloud Spanner? It is not built on PostgreSQL itself, but it supports a PostgreSQL-compatible dialect and handles distribution/replication transparently.
john_doe 10 months ago
I've considered managed/cloud-based solutions but want to maintain control and manage the infrastructure stack ourselves. Any idea how difficult it is to implement a custom auto-sharding solution for PostgreSQL?
decentralized 10 months ago
Why not go all-in on distribution and try a distributed SQL database like CockroachDB? It is compatible with the PostgreSQL wire protocol and shards data automatically.
distributed_genius 10 months ago
I've worked on a few distributed SQL projects and can attest to the complexity of building such a system. If maintaining control is the priority, perhaps consider a hybrid approach that combines managed services with a DIY sharding setup?
cost_conscious 10 months ago
How does DIY sharding compare to a managed service like RDS or BigQuery regarding cost efficiency in the long run?
system_architect 10 months ago
Hard to generalize, as company sizes and requirements vary. At larger scale, DIY sharding can be more cost-efficient, but at the price of increased operational complexity. A hybrid approach might balance cost and maintenance.
postgres_otaku 10 months ago
DIY sharding has other benefits: learning and understanding the intricacies of PostgreSQL, and ensuring feature sets match your business needs.
new_to_topic 10 months ago
Can someone ELI5 how auto-sharding works with PostgreSQL? What kind of trade-offs should I be aware of?
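experienced 10 months ago
Sharding splits a table's rows horizontally across several database nodes, usually by hashing or range-partitioning a shard key, so each node holds only part of the data and load. The distribution is only even if the key is well chosen. The main trade-offs are keeping data consistent and durable across shards, and handling cross-shard queries and rebalancing as you scale.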
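To make the routing part concrete, here is a minimal Python sketch of hash-based shard selection. The shard names and the `shard_for` and `moved_fraction` helpers are made up for illustration; a real setup would map each name to a separate PostgreSQL instance and would need its own plan for moving data.

```python
import hashlib

# Hypothetical shard names; in practice each would map to its own PostgreSQL instance.
SHARDS = ["shard_0", "shard_1", "shard_2", "shard_3"]

def shard_for(key: str, shards=SHARDS) -> str:
    """Pick a shard by hashing the shard key (stable across processes and restarts)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]

def moved_fraction(keys, old_shards, new_shards) -> float:
    """Fraction of keys that would land on a different shard after resharding."""
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)

if __name__ == "__main__":
    keys = [f"user:{i}" for i in range(10_000)]
    print("user:42 lives on", shard_for("user:42"))
    # Compare the 4-shard layout with a hypothetical 5-shard layout.
    print("moved after adding a shard:", moved_fraction(keys, SHARDS, SHARDS + ["shard_4"]))
```

With plain modulo routing like this, going from 4 to 5 shards relocates roughly 80% of the keys, which is one of the trade-offs to plan for. Consistent hashing or a lookup table of key ranges are the usual ways to reduce that movement.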
data_security 10 months ago
How do sharding solutions deal with data governance and security policies across distributed nodes?
database_admin 10 months ago
Each shard functions as a separate database, so policies have to be applied and audited on every shard rather than in one place. Replicating each shard is still critical for availability and disaster recovery.
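One practical consequence is that you need some way to notice when a shard's settings drift from the others. A rough sketch in Python, assuming you can already load each shard's policy settings into a dict; the shard names, policy fields, and `find_drift` helper are hypothetical and not part of any PostgreSQL API:

```python
from typing import Dict, List

Policy = Dict[str, str]

def find_drift(baseline: Policy, per_shard: Dict[str, Policy]) -> Dict[str, List[str]]:
    """Return, per shard, the sorted policy keys whose values differ from the baseline."""
    drift: Dict[str, List[str]] = {}
    for shard, policy in per_shard.items():
        # Look at keys from both sides so missing and extra settings are both caught.
        keys = set(baseline) | set(policy)
        diff = sorted(k for k in keys if baseline.get(k) != policy.get(k))
        if diff:
            drift[shard] = diff
    return drift

# Illustrative values only.
baseline = {"ssl": "required", "row_level_security": "on"}
per_shard = {
    "shard_0": {"ssl": "required", "row_level_security": "on"},
    "shard_1": {"ssl": "optional", "row_level_security": "on"},
}

print(find_drift(baseline, per_shard))  # {'shard_1': ['ssl']}
```

Running a check like this on a schedule is one way to keep decentralized enforcement from silently diverging.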
script_kiddie 10 months ago
At YC S20, they must have an unbelievable team to implement this on their own, right?
alumna 10 months ago
Definitely! That's one advantage of YC: access to incredible tech talent and resources. But it doesn't necessarily mean that building it in-house is the best solution for every team or user.