Founder at base14 here, the company that is building Scout. Thanks for the feedback. We do something similar for tracing as well, but pgX does a bit more than that: engineers should be able to trace (like you mention) and also see and analyse the condition of the DB itself, e.g. correlate a query slowdown to locks, vacuums, etc., all on one screen or within a couple of clicks. We are building specialised explorers like pgX for Postgres. Essentially, we are building telemetry readers for components that send relevant metrics and logs into a telemetry data lake. For each component/domain we learn from experts what they look at during analysis and incidents, and bring that into full-stack "unified" dashboards/mcp.
Scout is our otel-native observability product (data lake, UI, alerts, analytics, mcp, the works).
What we call pgX in the blog is an add-on to Scout.
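To make the lock/vacuum correlation point concrete, here is a rough sketch of the kind of signal such a Postgres explorer can pull together. This is not pgX's actual implementation, just standard catalog queries (pg_stat_activity, pg_blocking_pids, pg_stat_progress_vacuum) run from Python; the DSN is a placeholder.

```python
# Rough illustration only: the kind of "which queries are slow, and why" signal
# a Postgres explorer can surface. Not pgX's actual implementation.
import psycopg2  # assumes psycopg2 is installed; DSN below is a placeholder

BLOCKED_QUERIES_SQL = """
SELECT a.pid,
       now() - a.query_start   AS running_for,
       a.wait_event_type,
       a.wait_event,
       pg_blocking_pids(a.pid) AS blocked_by,
       left(a.query, 80)       AS query
FROM pg_stat_activity a
WHERE a.state = 'active'
  AND cardinality(pg_blocking_pids(a.pid)) > 0
ORDER BY running_for DESC;
"""

VACUUM_PROGRESS_SQL = """
SELECT p.pid, p.relid::regclass AS table_name, p.phase, a.query_start
FROM pg_stat_progress_vacuum p
JOIN pg_stat_activity a USING (pid);
"""

def snapshot(dsn: str = "postgresql://localhost/postgres"):
    """Take one snapshot of blocked queries and running vacuums."""
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute(BLOCKED_QUERIES_SQL)
        blocked = cur.fetchall()
        cur.execute(VACUUM_PROGRESS_SQL)
        vacuums = cur.fetchall()
    return blocked, vacuums

if __name__ == "__main__":
    blocked, vacuums = snapshot()
    print("blocked queries:", blocked)
    print("vacuums in progress:", vacuums)
```

The product idea is to keep this kind of view side by side with the traces, so a slowdown can be tied to a lock wait or a running vacuum without leaving the screen.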
Founder at base14 here, the company that is building Scout. Thanks for the feedback; I will work on improving my messaging.
Scout is an otel-native observability platform (data lake, UI, alerts, analytics, mcp, the works). We are building specialised explorers (the 'X' suffix stands for explorer) like pgX for Postgres. Essentially, we are building telemetry readers for components that send relevant metrics and logs into a telemetry data lake. For each component/domain we learn from experts what they look at during analysis and incidents, and bring that into a full-stack "unified" dashboard, going beyond what a regular Prometheus endpoint provides. Thanks again.
Very interesting solution, and a great idea to have a playground. I would love to know some details about the implementation of the architecture you have shared:
1. How do you query across multiple files? Do you have a query engine like DataFusion doing the heavy lifting, or is this a custom implementation?
2. How do you manage a WAL with real-time queryability across files? Have you seen any failures (e.g. recent entries going missing)?
Thanks again; this is a really interesting design and intuitively it looks more economical.
Thanks for your feedback, and great questions.
1. We create serverless functions to process each file and then combine the results, optimized for columnar file formats (see the sketch after this list).
2. This is one of our core innovations :) We created custom representations of WALs that help with query performance and let us ingest them quickly.
3. Once a WAL is ingested, it is available for query within a few seconds. So far it has been reliable and we have not had issues with missing data.
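Purely to illustrate the fan-out/combine pattern in point 1 (not the poster's actual implementation), here is a minimal local sketch: a thread pool stands in for serverless functions, pyarrow does the columnar reads, and the file and column names are made up.

```python
# Minimal local sketch of the fan-out/combine pattern described above.
# A thread pool stands in for serverless functions; in a real system each
# worker would be a separate function invocation. File paths are placeholders.
from concurrent.futures import ThreadPoolExecutor

import pyarrow.compute as pc
import pyarrow.parquet as pq

FILES = ["events-0001.parquet", "events-0002.parquet"]  # hypothetical shards

def scan_file(path: str):
    """Per-file work: read only the needed columns and pre-aggregate."""
    table = pq.read_table(path, columns=["user_id", "duration_ms"])
    return {
        "rows": table.num_rows,
        "total_ms": pc.sum(table["duration_ms"]).as_py() or 0,
    }

def combine(partials):
    """Merge the per-file partial results into one answer."""
    rows = sum(p["rows"] for p in partials)
    total = sum(p["total_ms"] for p in partials)
    return {"rows": rows, "avg_ms": (total / rows) if rows else None}

if __name__ == "__main__":
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(scan_file, FILES))
    print(combine(partials))
```

The columnar format is what makes the per-file step cheap: each worker reads only the columns the query needs, and only small partial results travel back to be combined.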
ClickHouse recently got support for the TimeSeries table engine [1]. It is marked as experimental, so yes, it is early stage.
This engine is quite interesting: data can be ingested via the Prometheus remote write protocol and read back via the Prometheus remote read protocol. Reading back is the weakest part, because remote read requires sending blocks of data back to Prometheus, which then unpacks those blocks and does the filtering and transformations on its own. As you can see, this doesn't leverage the true power of ClickHouse: query performance.
Yes, you can use SQL to read metrics directly from ClickHouse tables. However, many people prefer the simplicity of PromQL to the flexibility of SQL. So until ClickHouse gets native PromQL support, it remains early stage.
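To illustrate that trade-off, here is a hedged sketch of reading a metric directly from ClickHouse with SQL via the clickhouse-connect client. The `metrics` table layout is an assumption made up for the example (not the TimeSeries engine's internal schema), and the PromQL string is shown only for comparison.

```python
# Illustration of the PromQL-vs-SQL point: reading metrics straight from a
# ClickHouse table with SQL. The table layout below is hypothetical, not the
# TimeSeries engine's internal schema.
import clickhouse_connect  # assumes the clickhouse-connect package is installed

# PromQL: one expression does the windowing for you.
PROMQL = 'avg_over_time(node_memory_Active_bytes[5m])'

# Equivalent-ish SQL against a hand-rolled metrics table
# (name String, labels Map(String, String), ts DateTime, value Float64).
SQL = """
SELECT
    toStartOfInterval(ts, INTERVAL 5 MINUTE) AS bucket,
    avg(value) AS avg_value
FROM metrics
WHERE name = 'node_memory_Active_bytes'
  AND ts >= now() - INTERVAL 1 HOUR
GROUP BY bucket
ORDER BY bucket
"""

if __name__ == "__main__":
    client = clickhouse_connect.get_client(host="localhost")  # placeholder host
    for bucket, avg_value in client.query(SQL).result_rows:
        print(bucket, avg_value)
```

The SQL is perfectly workable, but every window, label filter, and rate calculation has to be spelled out by hand, which is exactly why many people reach for PromQL.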
Not a fan of bad-mouthing other offerings. As @iampims says, alerting is a big missing piece. ClickHouse is also a general-purpose database for many use cases, including analytics, financial services, ML & Gen AI, fraud, and observability.