Marc Serra - CEO Mática Partners
- Nov 3, 2020
- 2 min read

The Data Engineer’s Guide to Apache Spark and Delta Lake (by Databricks)

For data engineers looking to build faster and more reliable data pipelines with Apache Spark™ and Delta Lake, Databricks offers The Data Engineer’s Guide to Apache Spark and Delta Lake. This eBook features excerpts from the larger Spark: The Definitive Guide and from the Delta Lake Quickstart, by Databricks.

Preface:

Apache Spark™ has seen immense growth over the past several years, including its compatibility with Delta Lake. Delta Lake is an open-source storage layer that sits on top of your existing data lake file storage, such as AWS S3, Azure Data Lake Storage, or HDFS. Delta Lake brings reliability, performance, and lifecycle management to data lakes.

Databricks is proud to share excerpts from the Delta Lake Quickstart and the book, Spark: The Definitive Guide.

Download this eBook to:

Walk through the core architecture of a cluster, Spark Application, and Spark’s Structured APIs using DataFrames and SQL.
Get a tour of Spark’s toolset that developers use for different tasks, from graph analysis and machine learning to streaming and integrations with a host of libraries and databases.
Cover working with different kinds of data, including Booleans, numbers, strings, dates and timestamps, null handling, complex types, and user-defined functions.
Learn to get more reliable and higher-quality data with Delta Lake, including loading, updating, and rolling back data in your data lake.

Mini eBook - Apache Spark & Delta Lake-U

Download • 19.42MB

About Databricks

Databricks is the data and AI company. Thousands of organizations worldwide, including Comcast, Condé Nast, Nationwide and H&M, rely on Databricks’ open and unified platform for data engineering, machine learning and analytics. Databricks is venture-backed and headquartered in San Francisco, with offices around the globe. Founded by the original creators of Apache Spark™, Delta Lake and MLflow, Databricks is on a mission to help data teams solve the world’s toughest problems.

At Mática, we want to help you improve your decision-making by transforming and interpreting your business data with Big Data technologies and Artificial Intelligence.

Contact us and find out how we can help you.

Mática Barcelona
Av de Roma 13 Esc B Entr 2º
08029 Barcelona
Spain (Europe)

We help solve complex, high-impact problems using data, technology and artificial intelligence in a responsible way.

Mática Madrid
Calle Martínez Izquierdo 45
28028 Madrid
Spain (Europe)

copyright © 2020 Mática Partners SL - All rights reserved
