Jon Krohn

Data Lakes 101 (and Why They’re Key for AI Models), with Oz Katz

Added on August 15, 2025 by Jon Krohn.

Today, Oz Katz joins me to explain what data lakes are, as well as what data infrastructure is needed to train and run modern A.I. models (such as multi-modal LLMs). This is an informative one!

Oz Katz:

  • Is co-founder and CTO of lakeFS, which looks and feels like Git but is for data (instead of software) versioning and collaboration.

  • Was previously co-founder/CTO at a number of other tech startups, including Swayy, which was acquired by Similarweb in 2015.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

In: Data Science, YouTube, SuperDataScience, Podcast, Interview
Tags: superdatascience, ai, data, datalake, datawarehouse, git, versioning