Episodes

  • Foundation Models Unpacked: How Self-Supervised Learning Solved the AI Data Bottleneck
    Nov 2 2025

    Excerpts from Stanford lectures and commentary by Yann LeCun give an overview of self-supervised learning (SSL), an emerging paradigm in artificial intelligence. The sources explain that SSL trains large-scale deep learning models on unlabeled data, which removes traditional supervised learning's dependence on large labeled datasets. SSL works by defining a pretext task in which the supervision signal is generated automatically from the input data, such as predicting missing parts of an image (as in Masked Autoencoders) or putting shuffled patches back in order (the jigsaw puzzle task). The sources also introduce contrastive learning, which trains a model to produce similar representations for different views of the same object (positive pairs) and dissimilar representations for different objects (negative pairs). Once pre-trained on these tasks, the model's representations can be transferred to a more specific downstream task (such as classification or detection) using far less labeled data, through techniques such as fine-tuning or linear probing.

    14 mins
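
    The contrastive objective described in this episode can be sketched in a few lines. This is a minimal illustration, assuming an InfoNCE-style loss over cosine similarities; the function names, the temperature value, and the toy vectors are assumptions for the example, not something taken from the sources.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor (illustrative, not a library API).

    The loss is small when the anchor is closer to its positive view
    than to any of the negatives.
    """
    # Index 0 holds the positive pair; the rest are negative pairs.
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_denominator - logits[0]


# Toy 2-D "representations": two views of one object plus two other objects.
anchor = [1.0, 0.0]
aligned_view = [0.9, 0.1]
other_objects = [[0.0, 1.0], [-1.0, 0.0]]

loss_good = info_nce(anchor, aligned_view, other_objects)
loss_bad = info_nce(anchor, [0.0, 1.0], [aligned_view, [-1.0, 0.0]])
```

    In this toy setup `loss_good` comes out lower than `loss_bad`, because the first call pairs the anchor with a nearby view while the second treats an unrelated vector as the positive.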
  • 1 - 03 Generative Adversarial Networks: How GANs Work
    Oct 26 2025

    An overview of Generative Adversarial Networks (GANs), a class of machine learning models trained in an adversarial framework with two submodules: a generator and a discriminator. The episode explains the core idea of GANs through the analogy of a counterfeiter and the police, then goes deeper into generative modeling, highlighting the problem of intractable normalization constants and how GANs address it. It also examines the loss function used to train GANs, its relationship to a zero-sum (minimax) game, and common training problems such as mode collapse and vanishing gradients. Finally, it describes the adversarial nature of GANs and highlights their uses in image generation, video frame prediction, and image enhancement.

    17 mins
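
    The loss functions mentioned in this episode can be written out directly. This is a rough sketch, assuming the standard minimax formulation in which the discriminator outputs a probability strictly between 0 and 1 that a sample is real; the function names are illustrative, and the non-saturating generator loss is included as the commonly used workaround for vanishing gradients rather than as something the episode is known to cover.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Negative of the minimax value function, averaged over each batch.

    d_real: discriminator outputs D(x) on real samples, each in (0, 1).
    d_fake: discriminator outputs D(G(z)) on generated samples, each in (0, 1).
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)


def generator_loss_minimax(d_fake):
    """Original minimax generator loss: minimize log(1 - D(G(z)))."""
    return sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)


def generator_loss_non_saturating(d_fake):
    """Alternative generator loss: minimize -log D(G(z)).

    Its gradient does not flatten out when the discriminator confidently
    rejects the generated samples early in training.
    """
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


# A discriminator that cannot tell real from fake outputs 0.5 everywhere.
undecided = discriminator_loss([0.5, 0.5], [0.5, 0.5])
# A confident, correct discriminator achieves a lower loss.
confident = discriminator_loss([0.9, 0.9], [0.1, 0.1])
```

    With every output at 0.5 the discriminator loss equals 2·ln 2, which is the value reached when the discriminator can no longer separate real from generated samples.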
  • 1 - 02 How Retrieval Augmented Generation Fixed LLM Hallucinations
    Oct 19 2025

    The source material, an excerpt from a transcript of the IBM Technology video titled "What is Retrieval-Augmented Generation (RAG)?," explains a framework designed to enhance the accuracy and timeliness of large language models (LLMs). Marina Danilevsky, a research scientist at IBM Research, describes how LLMs often face challenges such as providing outdated information or lacking sources for their responses, which can lead to incorrect answers or hallucinations. The RAG framework addresses these issues by integrating a content repository that the LLM accesses first to retrieve relevant information in response to a user query. This retrieval-augmented process ensures that the model generates responses based on up-to-date data and can provide evidence to support its claims.

    17 mins
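
    The retrieve-then-generate flow described in this episode can be illustrated with a toy example. This is only a sketch under loud assumptions: the retriever ranks passages by simple word overlap (real systems typically use vector search), the sample documents are made up, and no actual LLM is called; the code stops at building the augmented prompt.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, top_k=1):
    """Return the top_k documents sharing the most words with the query.

    Word overlap is a stand-in for a real retriever (hypothetical helper).
    """
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(query, documents, top_k=1):
    """Put the retrieved passages in front of the user question."""
    passages = retrieve(query, documents, top_k)
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below and cite it. "
        "If the context is insufficient, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


# Made-up content repository for illustration only.
repository = [
    "The cafeteria opens at 8 am.",
    "The library closes at 9 pm on weekdays.",
]
question = "When does the library close?"
prompt = build_prompt(question, repository)
```

    The resulting `prompt` carries the library passage ahead of the question, so a model answering it has current, citable text to draw on instead of relying only on what it memorized during training.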
  • 1 - 03 Word Embeddings Explained
    Oct 11 2025

    This episode gives an overview of word embeddings: numerical representations of words, usually vectors, that capture semantic and contextual relationships. Because most machine learning algorithms cannot process plain text, raw text must first be converted into numbers, which makes word embeddings a fundamental tool in natural language processing (NLP). The source video describes several applications of embeddings, including text classification and named entity recognition (NER), and explains how embeddings are created by training models on large text corpora. Finally, it contrasts the two main approaches, frequency-based embeddings (such as TF-IDF) and prediction-based embeddings (such as Word2vec and GloVe), and concludes with the move toward contextual embeddings offered by Transformer models.

    17 mins
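
    The frequency-based approach mentioned in this episode can be sketched as follows. This is a minimal example, assuming one common TF-IDF variant (term frequency normalized by document length, inverse document frequency as log(N / df), whitespace tokenization); libraries often apply extra smoothing, so their numbers will differ. The function name and the two-sentence corpus are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(corpus):
    """Return one {term: weight} dictionary per document in the corpus."""
    tokenized = [doc.lower().split() for doc in corpus]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


corpus = ["the cat sat", "the dog barked"]
vectors = tf_idf(corpus)
```

    Here "the" appears in every document, so its weight is zero, while "cat" appears in only one document and gets a positive weight. The scores reflect how distinctive a word is within the corpus, not what it means, which is the gap that prediction-based and contextual embeddings aim to close.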