Petabyte-scale log analytics with Elasticsearch
Andrew Montalenti
Chief Product Officer
Parse.ly
Although Elasticsearch is best known as a distributed search engine and as a core plank of the Elastic Stack (used together with projects like Kibana and Logstash), it can also be thought of as a powerful columnar data store and time series analytics engine. This makes Elasticsearch usable in contexts where you might normally reach for systems like Google BigQuery, Amazon Athena, Snowflake, or, in open source, Dremel, Presto, Druid, or Spark. (You might even have considered tools like TimescaleDB, InfluxDB, or Prometheus in this category.) Elasticsearch has some important advantages as a time series log analytics engine, especially when you need your ES cluster to power live concurrent queries from real users. Using Elasticsearch in this way, however, requires some lateral thinking about how to index, store, and query your data, especially with regard to pre-aggregation.
In this 20-minute talk, we will discuss how a small engineering team built a large-scale time series analytics engine on top of Elasticsearch, running in production on AWS EC2. We started with a small cluster powered by Elasticsearch 1.3 (2014), when "aggregations" first became stable at scale, and grew to a much larger cluster powered by Elasticsearch 6.8 (2019-2020), as our system crossed a petabyte of stored log data. We'll also discuss our home-grown query layer for Elasticsearch, which bridges the gap between the ES query DSL and time-series-aware SQL.
Finally, we will discuss the open source work the team has done to prototype "index-stored aggregations" in Elasticsearch, with a focus on the challenging cardinality aggregation (aka approximate distinct count), where we see the potential for massive cost savings and query performance improvements from just a little open source work. We'll close with a discussion of the broader open source effort for index-stored data aggregations and data sketches in ES, which might lead to even more innovation and make Elasticsearch yet more competitive with other columnar time series storage options.
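To make the pre-aggregation problem concrete, here is a minimal sketch of why distinct counts are the hard case and why a stored, mergeable data sketch helps. It is an illustration only, not the team's implementation: Elasticsearch's cardinality aggregation is based on HyperLogLog++, whereas this example swaps in a simpler k-minimum-values sketch. The function names, the sketch size K, and the visitor IDs are all assumptions made up for the example.

```python
import hashlib
import heapq

K = 1024                  # hashes kept per sketch; error is roughly 1/sqrt(K)
HASH_SPACE = float(2**64)  # size of the 64-bit hash range


def _hash64(value: str) -> int:
    """Map a value to a stable 64-bit integer hash."""
    digest = hashlib.blake2b(value.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def build_sketch(values, k=K):
    """Keep only the k smallest distinct hashes (returned in ascending order)."""
    return heapq.nsmallest(k, {_hash64(v) for v in values})


def merge_sketches(a, b, k=K):
    """Merging needs only the two sketches, never the raw log lines."""
    return heapq.nsmallest(k, set(a) | set(b))


def estimate(sketch, k=K):
    """Estimate the distinct count from the k-th smallest hash."""
    if len(sketch) < k:
        # Fewer than k distinct hashes were seen, so the sketch holds them all.
        return float(len(sketch))
    return (k - 1) * HASH_SPACE / sketch[k - 1]


# Hypothetical data: two days of visitor IDs with 20,000 returning visitors.
day1 = [f"user-{i}" for i in range(0, 60_000)]
day2 = [f"user-{i}" for i in range(40_000, 100_000)]

sketch1 = build_sketch(day1)  # what a pre-aggregated daily document could store
sketch2 = build_sketch(day2)
merged = merge_sketches(sketch1, sketch2)

# Adding per-day distinct counts double-counts returning visitors.
naive_sum = len(set(day1)) + len(set(day2))  # 120,000, but the truth is 100,000

print("sum of daily counts:", naive_sum)
print("merged sketch estimate:", round(estimate(merged)))
```

The point of the design is that each per-day sketch is small and fixed in size, so a query over many days can merge stored sketches instead of rescanning raw events, while pre-computed daily counts alone cannot be combined correctly.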

Copyright © 2024 CTO Connection, All Rights Reserved