Exploring Dataset Metadata Between Projects with Data Catalog
GSP789
Overview
Data Catalog is a fully managed, scalable metadata management service in Google Cloud's Data Analytics family of products.
Managing data assets can be time-consuming and expensive without the right tools. Data Catalog provides a centralized place where organizations can find, curate, and describe their data assets.
Using Data Catalog
There are two main ways you interact with Data Catalog:
- Searching for data assets that you have access to
- Tagging assets with metadata
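To make the search side more concrete, here is a minimal sketch of how a Data Catalog search string can be assembled from qualifiers. It assumes the qualifiers `type=`, `projectid:`, and `column:` from the Data Catalog search syntax, with whitespace-separated terms combined as a logical AND. The helper `build_search_query` and the values `my-project-id` and `gender` are hypothetical and used only for illustration. The code does not call the Data Catalog service; it only builds the string.

```python
def build_search_query(*, keyword=None, asset_type=None, project_id=None, column=None):
    """Compose a Data Catalog search string from optional qualifiers.

    Hypothetical helper for illustration; it only joins the qualifiers
    with spaces and performs no API call.
    """
    parts = []
    if keyword:
        parts.append(keyword)                      # free-text term
    if asset_type:
        parts.append(f"type={asset_type}")         # e.g. table or dataset
    if project_id:
        parts.append(f"projectid:{project_id}")    # restrict to one project
    if column:
        parts.append(f"column:{column}")           # match on a column name
    return " ".join(parts)


# Placeholder values, not taken from the lab environment.
query = build_search_query(asset_type="table", project_id="my-project-id", column="gender")
print(query)
```

A string built this way could be typed into the Data Catalog search box or passed to a client library. Results are still limited to the assets the signed-in user has access to.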
What you will learn
In this lab, you will learn how to:
- Explore a simulated enterprise environment of 2 projects, 2 datasets, and 2 user accounts.
- Navigate through a BigQuery table manually in the UI.
- Run queries to better understand sensitive data columns that we want to tag later.
- Use Data Catalog to search for existing datasets across projects.
- Use Data Catalog tag templates to tag assets with rich metadata.
Why is this useful?
- View data assets across multiple projects in your organization.
- Create reusable tag templates to add rich data descriptions for your teams.
- Quickly highlight which datasets have PII (Personally Identifiable Information).
- Metadata access control is inherited from the logged-in user's permissions, so no separate Data Catalog ACLs are needed.
Prerequisites
Very Important: Before starting this lab, log out of your personal or corporate Gmail account, or run this lab in an Incognito window. This prevents sign-in confusion while the lab is running.