Petabyte-scale log analytics with elasticsearch
Andrew Montalenti
Chief Product Officer
Despite Elasticsearch's popularity as a distributed search engine and as a core plank of the Elastic stack (utilized together with projects like Kibana and Logstash), Elasticsearch can also be thought of as a powerful columnar data store and time series analytics engine. This makes Elasticsearch usable in contexts where you might normally be thinking of using systems like Google BigQuery, Amazon Athena, Snowflake, or, in open source, Dremel, Presto, Druid, or Spark. (You might even have considered tools like TimescaleDB, InfluxDB, or Prometheus in this category.) But Elasticsearch has some important advantages as a time series log analytics engine, especially when you need your ES cluster to power live concurrent queries from real users. Using Elasticsearch in this way, however, requires some lateral thinking on how to index, store, and query your data -- especially with regard to pre-aggregation. In this 20-minute talk, we will discuss how a small engineering team built out a large-scale time series analytics engine atop Elasticsearch, running in production atop AWS EC2, starting with a small cluster powered by Elasticsearch 1.3 (2014), when "aggregations" first became stable at scale, all the way through a much larger cluster powered by Elasticsearch 6.8 (2019-2020), as our system crossed over into over a petabyte of log data stored. We'll also discuss our home-grown query layer for Elasticsearch, which bridges the gap between the ES query DSL and time-series-aware SQL. Finally, we will discuss the open source work the team has done to prototype "index-stored aggregations" in Elasticsearch, with a focus on the challenging cardinality aggregation (aka approximate distinct count), a place where we see the potential for massive cost savings and query performance improvement, with just a little open source work. We'll close with a discussion of the broader open source effort for index-stored data aggregations and data sketches in ES, which might lead to even more innovation, and make it yet more competitive relative to other columnar time series storage options.
Interested in Tooling?
Visit our Tooling community!
We are using more and more tools every day. Here we discuss new and all tools every CTO or engineering leader should be aware of, we share feedback and best practices and help each other to use tools more efficiently. Currently, our main topics are Project management, CI/CD, Feature flagging, Security, Incident Response, Reliability/chaos engineering, monitoring/observability, low code/no-code/Serverless, Hosting.
James Smith, CEO and Co-Founder at Bugsnag
Poorna Udupi, CTO at Good Money
Eleanor Saitta, Principal at Systems Structure
Joni Klippert, CEO at StackHawk
Beverly May, VP UX & Product at GE Healthcare
Cat Posey, Senior Technology Director, ML at Capital One
Nick Rockwell, SVP Engineering at Fastly
Chase Douglas, CTO at Stackery
Jeremy Garcia, Director of Technical Community & Open Source at Datadog
Eleanor Saitta, Founder at Systems Structure
Scott Gerlach, Cofounder & CSO at StackHawk
Aaron Bedra, Senior Software Engineer at DRW

Copyright © 2024 CTO Connection, All Rights Reserved