DevConf.US 2021 has ended

DevConf.US 2021 is the 4th annual, free, Red Hat-sponsored technology conference for community project and professional contributors to Free and Open Source technologies, coming to a web browser near you!
Thursday, September 2 • 15:30 - 16:00
Fix your data bottlenecks with Trino, Hive, and S3


Relational databases are wonderful tools, and they are more than capable of handling many workloads. But one dark day the data stopped flowing. As our customer base grew, so did the data volume, and we couldn’t keep up. As we watched our queue grow, data processing put an undue burden on our database, preventing the API from serving requests in a timely manner. Enter Trino to save the day. This talk will discuss how and why our database was bottlenecked and how we leveraged object storage (AWS S3), Hive, and Trino to get our data processing pipeline back on track and scalable for future growth. We’ll look at the properties of our data and how its particular characteristics strained our system. Next, we’ll see how using object storage and the Parquet data format enables low-cost, long-term data storage and efficient data access for analytics workloads. Finally, we’ll explore how Trino with Hive enables fast, scalable data analysis using the SQL you already know. And rest assured, there’s still plenty of work left for a relational database. Leave with an overview of a big data toolchain and how one team is making use of it for big and not-so-big datasets, and learn when it may be time to switch data processing architectures to prevent show-stopping bottlenecks.
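
As a rough illustration of why object storage plus a columnar file format can give efficient access for analytics, here is a minimal Python sketch of Hive-style partitioned object keys, where each path segment has the form column=value. An engine that knows the partition columns can skip every key outside the requested partition instead of scanning the whole dataset. The table name, partition columns, and file naming below are hypothetical; the abstract does not describe the team's actual layout.

```python
from datetime import date

# Hypothetical table name; the abstract does not name the real one.
TABLE = "usage_events"


def partition_prefix(day: date) -> str:
    """Return the Hive-style key prefix (column=value segments) for one day."""
    return f"{TABLE}/year={day.year}/month={day.month:02d}/day={day.day:02d}/"


def object_key(day: date, part: int) -> str:
    """Return the object key for one Parquet file inside a day's partition."""
    return f"{partition_prefix(day)}part-{part:05d}.parquet"


def prune(keys: list[str], day: date) -> list[str]:
    """Keep only keys under the requested day's prefix, mimicking partition pruning."""
    prefix = partition_prefix(day)
    return [k for k in keys if k.startswith(prefix)]


# Three files spread over two days; a filter on one day touches only two of them.
keys = [
    object_key(date(2021, 9, 1), 0),
    object_key(date(2021, 9, 2), 0),
    object_key(date(2021, 9, 2), 1),
]
selected = prune(keys, date(2021, 9, 2))
```

In a real deployment the pruning is done by the query engine using partition metadata, not by string matching in application code; the sketch only shows why a date filter can avoid reading most of the stored files.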

Thursday September 2, 2021 15:30 - 16:00 EDT
Virtual