Data

Topic Tags

#analysis #intelligence #lake #sanitation #visualization #warehouse #database #etl #elt #dbt #storytelling #tableau #databricks #snowflake

Key Questions

What are data science, data analytics, and data exploitation?

What is a database, a data warehouse, and a data lake?

Why can't we just use Excel for everything?

How is data produced, where is it produced, how is it ingested?

What is end-to-end data observability and monitoring?

What is Extract, Transform, Load (ETL)? How does traditional ETL differ from modern ETL?

What is Extract, Load, Transform (ELT)? How does it differ from ETL?

What is dbt (data build tool)?

How mature is the organization in the collection, manipulation, and exploitation of data?

Where are the data silos in our organization and how did they come to be?

What are our business needs for data (e.g. latency, scale, security)?

What does the data tell us about our current business performance?

How can we improve our customer experience based on the data?

How can we design and implement a scalable data pipeline to ingest and process large volumes of structured and unstructured data from multiple sources?

What is a typical design of a cloud-native stack to derive business intelligence?

What is the difference between 'Official Sensitive' data and open data in a cloud context?

Correlation vs. causation: how do we avoid lying with statistics?
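
The correlation versus causation question can be made concrete with a small simulation. The sketch below is illustrative only: the variable names (temperature, ice_cream_sales, sunburn_cases) and all coefficients are invented for the example, not taken from any real dataset. It builds two series that are both driven by a hidden third factor, shows that they correlate strongly, and then shows that the correlation largely disappears once the shared driver is removed.

```python
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def residuals(target, driver):
    """What is left of `target` after a least-squares line on `driver` is removed."""
    n = len(target)
    mean_t = sum(target) / n
    mean_d = sum(driver) / n
    cov = sum((t - mean_t) * (d - mean_d) for t, d in zip(target, driver))
    var_d = sum((d - mean_d) ** 2 for d in driver)
    slope = cov / var_d
    intercept = mean_t - slope * mean_d
    return [t - (slope * d + intercept) for t, d in zip(target, driver)]


# Hypothetical data: temperature is the confounder that drives both series.
random.seed(42)
temperature = [random.uniform(0, 30) for _ in range(2000)]
ice_cream_sales = [2.0 * t + random.gauss(0, 5) for t in temperature]
sunburn_cases = [0.5 * t + random.gauss(0, 2) for t in temperature]

# Naive view: the two series move together, so one "must" cause the other.
raw_corr = pearson(ice_cream_sales, sunburn_cases)

# Controlled view: remove the effect of temperature from both series first.
adjusted_corr = pearson(
    residuals(ice_cream_sales, temperature),
    residuals(sunburn_cases, temperature),
)

print(f"raw correlation:      {raw_corr:.2f}")
print(f"adjusted correlation: {adjusted_corr:.2f}")
```

In this toy setup, ice cream does not cause sunburn; both follow temperature. Real confounders are rarely this easy to identify, so treat the residual approach as one diagnostic among several rather than a proof of causation.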

Learning Objectives

Hard Truths

Reality Check

Stop creating a new data lake to replace previous data lakes and warehouses!

Reality Check

The world's most popular database is Microsoft Excel. Any data strategy that ignores this reality will fail to get adopted.

Reality Check

Dashboards (Tableau & Power BI) are useless if they are just pretty pictures of bad data. Garbage In, Garbage Out.
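
One way to act on "Garbage In, Garbage Out" is to check the data before it reaches a dashboard. The following is a minimal sketch, assuming rows arrive as dictionaries with hypothetical order_id and amount fields; the function name find_quality_issues and the three rules it applies are illustrative choices, not a standard.

```python
def find_quality_issues(rows):
    """Return (row_index, problem) pairs for rows that should not feed a dashboard."""
    issues = []
    seen_ids = set()
    for index, row in enumerate(rows):
        order_id = row.get("order_id")
        amount = row.get("amount")

        # Rule 1: every row needs a unique identifier.
        if order_id is None:
            issues.append((index, "missing order_id"))
        elif order_id in seen_ids:
            issues.append((index, "duplicate order_id"))
        else:
            seen_ids.add(order_id)

        # Rule 2: amounts must be present and non-negative.
        if amount is None:
            issues.append((index, "missing amount"))
        elif amount < 0:
            issues.append((index, "negative amount"))
    return issues


# Hypothetical sample rows, invented for this example.
rows = [
    {"order_id": 1, "amount": 120},
    {"order_id": 2, "amount": None},
    {"order_id": 2, "amount": 35},
    {"order_id": None, "amount": -10},
]

issues = find_quality_issues(rows)
for index, problem in issues:
    print(f"row {index}: {problem}")
```

Surfacing these rows to the data owner, rather than silently dropping them, keeps the dashboard honest about what it does and does not include.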

Reality Check

Business users don't care about your 'Data Lakehouse architecture'; they just want to know why the numbers on page 5 don't match the numbers on page 2.
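
A common reason two pages disagree is that they apply different definitions to the same data. The sketch below illustrates that with invented order records: a hypothetical "page 2" figure counts every booked order, while a hypothetical "page 5" figure counts only paid ones. The field names, statuses, and function names are assumptions made for the example.

```python
# Hypothetical order records; amounts are whole currency units.
orders = [
    {"id": 1, "amount": 100, "status": "paid"},
    {"id": 2, "amount": 250, "status": "paid"},
    {"id": 3, "amount": 80, "status": "refunded"},
    {"id": 4, "amount": 40, "status": "cancelled"},
]


def gross_revenue(orders):
    """Assumed 'page 2' definition: every booked order counts."""
    return sum(order["amount"] for order in orders)


def net_revenue(orders):
    """Assumed 'page 5' definition: only paid orders count."""
    return sum(order["amount"] for order in orders if order["status"] == "paid")


def reconcile(orders):
    """Explain the gap between the two figures, broken down by order status."""
    explained = {}
    for order in orders:
        if order["status"] != "paid":
            explained[order["status"]] = (
                explained.get(order["status"], 0) + order["amount"]
            )
    return gross_revenue(orders), net_revenue(orders), explained


gross, net, explained = reconcile(orders)
print(f"page 2 (gross): {gross}")
print(f"page 5 (net):   {net}")
print(f"gap explained by status: {explained}")
```

Publishing a reconciliation like this next to the figures, together with one agreed definition of each metric, tends to answer the business user's question more directly than an architecture diagram.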