Codelab → https://goo.gle/4nInEzb
GitHub → https://goo.gle/3W6TKJc
Blog Post → https://goo.gle/46GkqFK
Manual data quality rule creation is often timecconsuming and inconsistent. In this video, we’ll demonstrate a practical, programmatic workflow on Google Cloud to automate data quality rule generation and deployment using Dataplex and generative AI. Reduce manual effort and build more reliable data pipelines. This session provides a robust pattern for building scalable data quality systems, enabling you to manage your data governance policies as code.
Here’s what we’ll cover:
1️⃣ Flattening complex data: See how to use Materialized Views in BigQuery to prepare nested data for comprehensive Dataplex profiling.
2️⃣ Automated profiling: Learn to programmatically trigger and manage Dataplex data profile scans using the Python client.
3️⃣ AI powered rule suggestion: Discover how to export profile results and leverage the Gemini CLI to intelligently suggest Dataplex compliant data quality rules in YAML format.
4️⃣ Human-in-the-Loop validation: Understand the critical importance of human review to validate and refine AI generated configurations before deployment.
5️⃣ Deploying quality scans: We walk through deploying these validated rules as automated Dataplex data quality scans.
🔔 Subscribe to Google Cloud Tech → https://goo.gle/GoogleCloudTech
Speaker: Hyunuk Lim
Products Mentioned: Dataplex Universal Catalog, BigQuery,