NLP Demystified 9: Automatically Finding Topics in Documents with Latent Dirichlet Allocation

11,888 plays
Course playlist: https://www.youtube.com/playlist?list=PLw3N0OFSAYSEC_XokEcX8uzJmEZSoNGuS

What do you do when you need to make sense of a pile of documents and have no other information? In this video, we'll learn one approach to this problem using Latent Dirichlet Allocation. We'll cover how it works, then build a model with spaCy and Gensim to automatically discover topics present in a document and to search for similar documents.

Colab notebook: https://colab.research.google.com/github/futuremojo/nlp-demystified/blob/main/notebooks/nlpdemystified_topic_modelling_lda.ipynb

Timestamps
00:00:00 Topic modelling with LDA
00:00:21 The two assumptions an LDA topic model makes
00:03:15 Building an LDA Machine to generate documents
00:10:16 The Dirichlet distribution
00:14:43 Further enhancements to the LDA machine
00:17:01 LDA as generative model
00:20:15 Training an LDA model using Collapsed Gibbs Sampling
00:28:44 DEMO: Discovering topics in a news corpus and searching for similar documents
00:45:24 Topic model use cases and other models

This video is part of Natural Language Processing Demystified, a free, accessible course on NLP. Visit https://www.nlpdemystified.org/ to learn more.
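
For readers who want a feel for the collapsed Gibbs sampling step listed in the timestamps before watching, here is a minimal pure-Python sketch. It is not the spaCy and Gensim model built in the demo or the Colab notebook. The function names (train_lda, topic_proportions), the toy corpus, and the hyperparameter values (alpha, beta, iterations) are all assumptions chosen for illustration.

```python
import random


def train_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized docs (lists of strings).

    Hypothetical helper for illustration; hyperparameters are assumed values.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]                # tokens in doc d assigned to topic k
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(num_topics)]  # times word w assigned to topic k
    topic_total = [0] * num_topics                              # all tokens assigned to topic k

    # Start by assigning every token a random topic.
    assignments = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # Weight of topic t is proportional to
                # (doc-topic count + alpha) * (topic-word count + beta)
                #   / (topic total + vocab_size * beta).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]

                # Sample a new topic and add the token back.
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    return doc_topic, topic_word


def topic_proportions(doc_topic, alpha=0.1):
    """Smoothed per-document topic mixture estimated from the final counts."""
    proportions = []
    for counts in doc_topic:
        denom = sum(counts) + len(counts) * alpha
        proportions.append([(c + alpha) / denom for c in counts])
    return proportions


# Toy corpus (made up for this sketch).
toy_docs = [
    ["goal", "match", "team", "goal", "league"],
    ["stock", "market", "shares", "stock", "bank"],
    ["team", "league", "match", "coach"],
    ["bank", "shares", "market", "rates"],
]
toy_doc_topic, toy_topic_word = train_lda(toy_docs, num_topics=2)
for row in topic_proportions(toy_doc_topic):
    print([round(p, 2) for p in row])
```

Each printed row is one document's estimated topic mixture. On a corpus this small the split between topics can vary with the seed, so treat the output as a demonstration of the mechanics rather than a meaningful result.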