Fivetran, a global leader in automated data movement, today announced support for Amazon Simple Storage Service (Amazon S3) with the Apache Iceberg data lake format. Amazon S3 is an object storage service from Amazon Web Services (AWS) that offers industry-leading scalability, data availability, security and performance. Apache Iceberg is a widely supported open-source data format that offers atomic, consistent, isolated and durable (ACID) transactions for data lakes. As the automated data movement platform, Fivetran anonymizes personally identifiable information (PII) while cleansing, normalizing and automatically loading data into the lake.
With expansive storage capacity and support for multiple data formats, the data lake is a popular destination for teams doing analysis on massive data sets or running extensive data science projects that fuel their business. Hundreds of thousands of data lakes run on top of Amazon S3 and, of the many enterprise teams that have already put them to work, a majority cite enhanced business agility, improved development of products and services, and enhanced customer service and engagement as benefits of data lakes.
“Fivetran supporting Amazon S3 as a destination is a big deal for our platform Distilled, and anyone building external data and analytics products,” said Aaron Peabody, Co-Founder and CEO at Untitled Firm. “This new destination allows our customers to tap into the full potential of AWS’s services. We couldn’t be more excited that Fivetran has invested in this destination as it is a force multiplier catalyst for our own product roadmap at Untitled.”
“We now automatically extract, cleanse, deduplicate, and make ready for analysis large volumes of semi-structured data to power data lakes in the same reliable and secure way our customers get their data into their cloud warehouses today,” said Fraser Harris, Vice President of Product at Fivetran. “Fivetran and AWS share a vision that without structure, governance and accuracy of data in a data lake, organizations are unnecessarily increasing complexity and not realizing the full value of the data they store there. Fivetran’s mission is to make access to data as simple and reliable as electricity, and this new support brings that promise to the world of data lakes.”
“We are delighted that the accessibility of Amazon S3 with Iceberg continues to grow,” said Greg Khairallah, Director of Analytics at AWS. “It’s an easy way for our customers to simplify data ingestion while providing customers the scalability of a data lake and the reliable data transformation of a data warehouse.”
As organizations continue to leverage data lakes to run analytics and extract insights from their data, progressive marketing intelligence teams are demanding more of them, and solutions like Amazon S3 and automated pipeline support are meeting that demand.
Tinuiti, one of the largest independent performance marketing firms, handles large volumes of data every day and needs a data lake, Amazon S3 in particular, to power its customers’ brand potential.
“The data lake is an easy, affordable, secure and robust way to store all our customers’ data,” said Lakshmi Ramesh, Vice President, Data Services at Tinuiti. “The main challenge is in optimizing performance and accessibility, but with Fivetran’s support for Amazon S3 with Iceberg it will further optimize our Fivetran pipeline. Since the data lake is our single source of truth, it is critical that all the data ingested from different sources be accessible in the data lake.”
Rather than working through all the manual steps required to ingest data, cleanse it, prepare it for use, hash and block sensitive data, and only then start querying it, modern organizations see great value in reducing data lake management effort through pipeline automation and governance.
“Fivetran’s support for Amazon S3 and its standardization on Iceberg format makes it easier than ever for organizations to get their data into a lakehouse,” said Tomer Shiran, co-founder and CPO, Dremio. “With Fivetran, AWS and Dremio, organizations can build their open data lakehouse architecture for users to quickly access and query data and provide critical data-driven business insights.”