34 points by cloud_analytics 4 months ago | 12 comments
user1 4 months ago
Nice work! Real-time analytics are really useful for quickly seeing what's happening in the system. I have a question about the technology stack you used — could you share the details?
user2 4 months ago
Impressive! I've been looking for a solution to monitor my cloud infrastructure, and this looks perfect. How fast does it update the data? Is it truly real-time or just near real-time?
creator 4 months ago
Awesome! Real-time updates are not always necessary for every use case, but it's nice to have the flexibility to enable them as needed. For example, in an alerting system, real-time might be critical, whereas reporting data might be okay with near real-time.
creator 4 months ago
Thanks! Yes, it's built using React for the frontend and a microservices architecture on the backend. Each service is responsible for a specific functionality, and they communicate through a message broker. Specifically, for this dashboard, we use Prometheus and Grafana for data aggregation and visualization.
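To illustrate the broker-based decoupling: this is a toy sketch, not our actual services (a real deployment would use a broker like Kafka or RabbitMQ), but it shows the idea of services publishing and consuming events instead of calling each other directly.

```python
import queue
import threading

# Toy stand-in for a message broker: an in-memory queue.
broker = queue.Queue()

def metrics_service():
    # One microservice publishes an event instead of calling another service.
    broker.put({"topic": "metrics", "name": "cpu_usage", "value": 0.7})

def dashboard_service(results):
    # Another service consumes from the broker at its own pace.
    event = broker.get(timeout=1)
    results.append(event["name"])

results = []
producer = threading.Thread(target=metrics_service)
consumer = threading.Thread(target=dashboard_service, args=(results,))
producer.start(); consumer.start()
producer.join(); consumer.join()
print(results)  # ['cpu_usage']
```

The point of the broker is that neither service needs to know the other exists — you can add more consumers (alerting, archiving) without touching the producer.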
user3 4 months ago
May I ask which rdbms or nosql db you are using to store the metrics? and how long do you store them?
user4 4 months ago
Interesting — what was the reason for choosing a hybrid solution? Does it require manual intervention to decide which data goes into which DB?
creator 4 months ago
A hybrid solution works best for our requirements: the TSDB is ideal for storing time-series data with high cardinality, while the SQL side stores metadata, events, and other non-time-series data. We use a data classification pipeline to determine where data belongs and dynamically assign the appropriate sink — no manual intervention needed.
creator 4 months ago
The data updates every few seconds, so it's close to real-time. Depending on the metric, we store the data for different periods, ranging from a few days to several months. We use a combination of a Time Series Database (TSDB) and a SQL database for persistence.
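For illustration, per-metric retention can be as simple as a prefix-to-window lookup. The prefixes and numbers here are hypothetical, not our actual policy:

```python
# Hypothetical retention policy: metric-name prefix -> retention window in days.
RETENTION_DAYS = {
    "debug_": 3,       # short-lived diagnostic metrics
    "request_": 30,    # request-level metrics kept a month
    "billing_": 180,   # billing aggregates kept several months
}
DEFAULT_RETENTION_DAYS = 14

def retention_for(metric_name: str) -> int:
    """Return how many days to keep a metric, falling back to a default."""
    for prefix, days in RETENTION_DAYS.items():
        if metric_name.startswith(prefix):
            return days
    return DEFAULT_RETENTION_DAYS

print(retention_for("request_latency_ms"))  # 30
print(retention_for("uptime_seconds"))      # 14
```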
user6 4 months ago
Can you elaborate on the data aggregation process? How do you avoid cost amplification during data ingestion?
creator 4 months ago
Data aggregation is done using Prometheus, which scrapes metrics from all data sources at regular intervals (10-30s, depending on configuration). Aggregation steps include data normalization, filtering, and bucketing to reduce data volume while keeping processing fast.
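Bucketing, in simplified form, looks something like this — a toy downsampler that keeps one averaged point per fixed-width time bucket (illustrative only, not our implementation):

```python
from collections import defaultdict
from statistics import mean

def bucket_samples(samples, bucket_seconds=30):
    """Downsample (timestamp, value) pairs into fixed-width time buckets,
    keeping one averaged point per bucket to cut ingestion volume."""
    buckets = defaultdict(list)
    for ts, value in samples:
        start = int(ts // bucket_seconds) * bucket_seconds
        buckets[start].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}

raw = [(0, 1.0), (10, 3.0), (29, 2.0), (35, 4.0)]
print(bucket_samples(raw))  # {0: 2.0, 30: 4.0}
```

Four raw points become two stored points here; at real scrape volumes that ratio is what keeps ingestion costs from amplifying.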
user5 4 months ago
Are you planning to make this a commercial product? How much would this type of setup cost since I have huge volumes of data?
creator 4 months ago
We're currently evaluating pricing models and determining the best way to scale and offer this to a larger audience. For individual use cases with large data volumes, we expect the costs to vary significantly. Typically, data ingestion costs scale with the number of ingestion sources and the rate at which they generate data. We use dimensionality reduction and data filtering techniques to reduce overhead and control costs.
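As a toy example of the kind of label filtering involved (field names and the allowlist are hypothetical): dropping high-cardinality labels like request or user IDs before ingestion keeps the number of distinct series — and therefore storage cost — bounded.

```python
def reduce_labels(metric: dict, allowed=("service", "region", "status")) -> dict:
    """Drop high-cardinality labels (request IDs, user IDs, ...) before
    ingestion so each metric keeps a small, bounded set of series."""
    labels = {k: v for k, v in metric.get("labels", {}).items() if k in allowed}
    return {**metric, "labels": labels}

raw = {
    "name": "http_requests_total",
    "labels": {"service": "api", "status": "200",
               "request_id": "abc123", "user_id": "42"},
}
print(reduce_labels(raw)["labels"])  # {'service': 'api', 'status': '200'}
```

Every label kept multiplies the potential series count, so a small allowlist is the cheapest cost-control lever.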