Welcome to the September 2023 edition of the pganalyze newsletter!
Last month we had the honor of joining some of our favorite podcasts (Stack Overflow Podcast, O11ycast & others!) to talk about Postgres performance optimization, database observability best practices, the role of data platform teams, how to scale Postgres expertise, and more. You can find direct links to all podcast episodes below.
We've also released the first version of a pganalyze feature we're very excited about: The new OpenTelemetry tracing integration for combining application-side tracing info ("which of my app requests was slow, and why") with Postgres EXPLAIN plans. This adds a direct link to pganalyze from within your application observability tools, enabling easier performance optimization work for the whole team. Read more below, or give it a try!
And of course, as always, you can find our 5mins of Postgres episodes (including two on the recently released Postgres 16), as well as recent pganalyze product improvements, in this newsletter.
All the best,
Lukas
Exporting Postgres EXPLAIN plans as traces to OpenTelemetry pipelines with pganalyze
Over the last couple of years, the OpenTelemetry standard has emerged as a unified, vendor-agnostic way of collecting traces, metrics and logs from applications. Its tracing component, the most mature part of the standard, works especially well for connecting data points across different services.
What if we treated the database as another service that produces tracing data for slow query plans? This would give us a unified view of application performance, including specifics on database slowdowns.
There are a few steps to make this work:
1. Sending traceparent information to the database. We can rely on the "traceparent" method of propagating context across services by having applications add it to queries as a query tag in a SQL comment. You can utilize existing libraries such as sqlcommenter to enable this in your application (see the example query after this list).
2. Tracking detailed query plans for slow queries with the auto_explain extension (supported by all released Postgres versions), logging them to the Postgres error log, and preserving the traceparent as a query comment (see the configuration sketch after this list).
3. Parsing the Postgres logs using the pganalyze collector, and having it construct a span for each collected EXPLAIN plan, with the appropriate parent span configured, and then sending it to an OTLP endpoint.
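To illustrate step 1, here is roughly what a tagged query looks like. This is a minimal sketch with a hypothetical orders table, and the trace and span IDs are placeholder values:

    -- Query tagged with W3C trace context, as produced by libraries like sqlcommenter
    SELECT *
      FROM orders
     WHERE customer_id = 42
    /*traceparent='00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01'*/;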
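And for step 2, a minimal sketch of trying out auto_explain in a single session (for a production setup you would load auto_explain via shared_preload_libraries and put these parameters into postgresql.conf; the one second threshold is just an example):

    -- Load auto_explain into the current session only, for a quick test
    LOAD 'auto_explain';

    -- Log the execution plan of any query that runs longer than 1 second
    SET auto_explain.log_min_duration = '1s';

    -- JSON-formatted plans are easiest for log-parsing tools to work with
    SET auto_explain.log_format = 'json';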
The good news: We have a first version of this ready for testing! You can try this out today with the new pganalyze-collector 0.52.0 release, and give application teams a direct link to the full query plan and pganalyze's tuning recommendations, right within existing application tracing tools such as Honeycomb or New Relic!
For more background on why this matters, you can listen to the recent O11ycast episode, where Lukas spoke with Charity and Jess from Honeycomb on the benefits of combining database and application observability.
We'll have more details on how this works in the next couple of weeks, with specific instructions for different APM tools - in the meantime, don't hesitate to reach out with questions or feedback!
Ben and Ryan talk with Lukas about why he considers himself a founder-CEO by title but an engineer by trade, what's important to understand about optimizing slow SQL queries, and the role of AI in database optimization.
Charity and Jess from Honeycomb speak with Lukas about database observability. They explore the challenges of making it intuitive to understand why databases are slow, the open source library sqlcommenter, query plans, and how to best support application teams.
Lukas joined the Tech ONTAP Podcast to discuss the ins and outs of Postgres, how to improve performance, the impact of storage latency, and how pganalyze makes life better for data platform engineers.
SE Radio host Philip Winston speaks with Lukas about database indexing, queries, maintenance, scaling, and stored procedures. They also discuss pganalyze features such as the Index Advisor and VACUUM Advisor.
We take a close look at using EXPLAIN (ANALYZE, BUFFERS) to optimize Postgres queries, and what to know when interpreting the shared hits counter for Nested Loops.
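If you want to follow along, this is the kind of invocation the episode discusses (the tables in this sketch are made up):

    -- Show the executed plan together with buffer usage (shared hit/read counts)
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT o.*
      FROM orders o
      JOIN customers c ON c.id = o.customer_id
     WHERE c.country = 'US';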
We highlight four new features in Postgres 16 that you might not know about yet. You’ll learn about pg_input_is_valid, pg_input_error_info, improvements to \watch, number separators from the SQL:2023 standard, and more!
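For a quick taste, here is what the new validation functions and the number separators look like in practice:

    -- New in Postgres 16: check whether a value is valid input for a type, without raising an error
    SELECT pg_input_is_valid('42', 'integer');            -- true
    SELECT pg_input_is_valid('not a number', 'integer');  -- false

    -- Get details on why the input is invalid
    SELECT * FROM pg_input_error_info('not a number', 'integer');

    -- Underscores as digit separators in numeric literals (SQL:2023)
    SELECT 1_000_000 AS one_million;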
We talk about understanding Postgres IOPS, why looking at I/O performance matters even when your data fits into the cache, and how Postgres lays out tables on disk - including how we can query the "ctid" field to find the page a row is physically stored on.
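If you want to follow along, the hidden ctid column can be selected directly (the table name here is just a placeholder):

    -- ctid identifies a row's physical location as (page number, item offset within that page)
    SELECT ctid, * FROM orders LIMIT 5;
    -- A ctid of (0,1) means the row is the first item on page 0 of the table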
Subscribe on YouTube
We publish weekly videos on Postgres, our conference talks, tutorials, product walk-throughs, and more.
VACUUM Blocked by Xmin Horizon check: Fix an edge case where pganalyze did not correctly trigger an alert when the Xmin Horizon was held back for a very long time
Increase density of events shown on VACUUM Timeline for multi-day time selections
VACUUM Activity: Update lane label for manual VACUUMs in VACUUM Timeline for better distinction from autovacuums
OpenTelemetry integration
Optionally configure pganalyze to export EXPLAIN plans as OpenTelemetry tracing spans (requires pganalyze-collector 0.52.0)