Building a Data Infrastructure for AI/ML – Keith Pijanowski, MinIO
The Open Table Formats (OTFs) designed by Netflix (Apache Iceberg), Uber (Apache Hudi), and Databricks (Delta Lake) have made it possible to build a cloud-native data infrastructure capable of supporting all the requirements of AI/ML. Such a data infrastructure can hold all the data needed for all model types and scale out as capacity requirements change. This session will present a reference architecture for building an AI/ML data infrastructure and show how it supports MLOps, distributed training, and the advanced data manipulation techniques made possible by OTF-based data storage. Additionally, this talk will show how such a data platform can support the special tooling needed for Generative AI.