Data
Key Questions
What are data science, data analytics, and data exploitation?
What are a database, a data warehouse, and a data lake?
Why can't we just use Excel for everything?
How is data produced, where is it produced, how is it ingested?
What is end-to-end data observability and monitoring?
What is Extract, Transform, Load (ETL)? What is traditional vs modern ETL?
What is Extract, Load, Transform (ELT)? How does it differ from ETL?
What is dbt (data build tool)?
How mature is the organization in the collection, manipulation, and exploitation of data?
Where are the data silos in our organization and how did they come to be?
What are our business needs for data (e.g. latency, scale, security)?
What does the data tell us about our current business performance?
How can we improve our customer experience based on the data?
How can we design and implement a scalable data pipeline to ingest and process large volumes of structured and unstructured data from multiple sources?
What is a typical design of a cloud-native stack to derive business intelligence?
What is the difference between 'Official Sensitive' data and open data in a cloud context?
Correlation vs. Causation: How to not lie with statistics.
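As a starting point for the ETL and ELT questions above, here is a minimal sketch of the difference, assuming an in-memory SQLite database stands in for the warehouse. The records, table names, and function names are all hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical raw records, as they might arrive from a source system.
RAW_ORDERS = [
    ("A-1", " 19.5 ", "gb"),
    ("A-2", "5.0", "GB"),
    ("A-3", "12.25", "fr"),
]


def run_etl(conn):
    """ETL: clean the data in application code, then load only the result."""
    cleaned = [
        (order_id, float(amount), country.upper())
        for order_id, amount, country in RAW_ORDERS
    ]
    conn.execute("CREATE TABLE orders_etl (id TEXT, amount REAL, country TEXT)")
    conn.executemany("INSERT INTO orders_etl VALUES (?, ?, ?)", cleaned)


def run_elt(conn):
    """ELT: load the raw data untouched, then transform it inside the database."""
    conn.execute("CREATE TABLE orders_raw (id TEXT, amount TEXT, country TEXT)")
    conn.executemany("INSERT INTO orders_raw VALUES (?, ?, ?)", RAW_ORDERS)
    conn.execute(
        """
        CREATE TABLE orders_elt AS
        SELECT id,
               CAST(TRIM(amount) AS REAL) AS amount,
               UPPER(country) AS country
        FROM orders_raw
        """
    )


conn = sqlite3.connect(":memory:")
run_etl(conn)
run_elt(conn)
etl_rows = conn.execute("SELECT * FROM orders_etl ORDER BY id").fetchall()
elt_rows = conn.execute("SELECT * FROM orders_elt ORDER BY id").fetchall()
print(etl_rows == elt_rows)
```

Both paths should end with the same cleaned rows. The difference is where the transformation runs and whether the raw data is kept: in the ELT path, `orders_raw` is still available for re-processing if the cleaning rules change.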
Learning Objectives
Hard Truths
Stop creating a new data lake to replace previous data lakes and warehouses!
The world's most popular database is Microsoft Excel. Any data strategy that ignores this reality will fail to get adopted.
Dashboards (Tableau & Power BI) are useless if they are just pretty pictures of bad data. Garbage In, Garbage Out.
Business users don't care about your 'Data Lakehouse architecture'; they just want to know why the numbers on page 5 don't match the numbers on page 2.
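The mismatched-numbers complaint often comes down to two reports applying different definitions to the same source data. The sketch below illustrates one way to reconcile two such figures, assuming a hypothetical sales table and two hypothetical metric definitions; none of these names come from a real system.

```python
from collections import defaultdict

# Hypothetical sales records standing in for a shared source table.
SALES = [
    {"region": "North", "amount": 120.0, "status": "complete"},
    {"region": "North", "amount": 80.0, "status": "refunded"},
    {"region": "South", "amount": 200.0, "status": "complete"},
]


def total_all(rows):
    """Metric as one report might define it: every order counts."""
    return sum(r["amount"] for r in rows)


def total_completed(rows):
    """Metric as another report might define it: completed orders only."""
    return sum(r["amount"] for r in rows if r["status"] == "complete")


def reconcile(rows):
    """Explain the gap between the two definitions, broken down by status."""
    gap_by_status = defaultdict(float)
    for r in rows:
        if r["status"] != "complete":
            gap_by_status[r["status"]] += r["amount"]
    return dict(gap_by_status)


page_2 = total_all(SALES)
page_5 = total_completed(SALES)
gap = reconcile(SALES)
print(page_2, page_5, gap)
```

A check like this turns "the numbers don't match" into a specific answer, here that one report includes refunded orders and the other does not.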