Power Query - Extract PDF Tables by the Table's Content

3,083 plays
Learn how to extract tables from PDF files based on the content of the tables. This technique does NOT rely on table names or page locations. Plus, the video covers many useful tricks for data discovery and manipulation.

File Download Link: https://www.bcti.com//wp-content/YT_Downloads/BCTI_ExtractPDF_Content.zip

00:03 Overview of Problem
01:10 File Download Instructions
01:20 Main Issues (Project Overview)
02:08 Connecting to a Folder (Controlling Scope)
03:13 Filter for PDF Files
03:43 Remembering Where Records Came From
04:54 Extracting PDF Metadata
05:38 Extracting Tables from PDF Files
06:11 Discovering Needed Tables
06:50 Converting Tables to Rows
07:12 Combining Nested Lists into a Single List
07:31 Searching for the Keyword that Identifies Needed Tables
08:30 Expanding Table Contents
08:43 Promoting the Header Row
08:49 Removing Unwanted Columns
08:59 Unpivoting the Stacked Tables
09:13 Creating Proper Dates
09:36 Removing Unwanted Rows (Errors)
09:59 Loading the Results to Excel
10:07 Building a Report
10:44 Testing for New Data
11:15 Issues with Hardcoded Titles
12:55 Renaming Columns by Position
14:02 Updating the Remaining Code
14:34 Testing the Dynamic Column Name Feature
14:54 Project Conclusion
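
The core idea described above, picking tables by what they contain rather than by their name or page, can be modeled in a few lines. The video builds this in Power Query. The Python below is only a rough sketch of the same logic, not the author's code. The sample tables, the keyword "Region", the new column name "Category", and all function names are hypothetical, chosen purely for illustration.

```python
from typing import Any

Table = list[list[Any]]  # a table modeled as a list of rows


def flatten_cells(table: Table) -> list[Any]:
    """Combine the nested row lists into one flat list of cell values."""
    return [cell for row in table for cell in row]


def contains_keyword(table: Table, keyword: str) -> bool:
    """True if any cell equals the keyword exactly (case-sensitive)."""
    return keyword in flatten_cells(table)


def select_by_content(tables: dict[str, Table], keyword: str) -> dict[str, Table]:
    """Keep only tables whose content includes the keyword, ignoring names."""
    return {name: t for name, t in tables.items() if contains_keyword(t, keyword)}


def rename_first_column(header: list[str], new_name: str) -> list[str]:
    """Rename a column by position, so a changing title cannot break later steps."""
    return [new_name] + header[1:]


# Hypothetical sample data: table names are arbitrary and may differ per file.
extracted = {
    "Table001": [["Page 1 of 2", None]],
    "Table002": [["Region", "Jan", "Feb"], ["East", 10, 12]],
    "Table007": [["Notes", "n/a"]],
}

needed = select_by_content(extracted, "Region")
header = rename_first_column(needed["Table002"][0], "Category")
```

Matching on content means a table is still found if its name or page changes between files. Renaming by position addresses the same kind of fragility for column titles that would otherwise be hardcoded.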