Intro to Apache DataFusion: Technology, Community, and Not Quite Enough Time

297 listens
Andrew Lamb, Apache DataFusion PMC chair, introduces the project, an extensible query engine, compares it to LLVM, and then describes how it is used to build custom databases, file formats, table formats, and SQL analysis. I then briefly explain the Apache Software Foundation governance structure and how it benefits DataFusion, and ask people to come and join the community. I have given versions of this talk at many of the meetups I have attended in person, but I never seem to have been able to record it.

Links:
https://datafusion.apache.org/
https://github.com/apache/datafusion
https://datafusion.apache.org/contributor-guide/communication.html

Slides: https://docs.google.com/presentation/d/1p5e07qrN-R8Nuyb6INyY4YM50QESpgmnSdGrLViRcfk

AI Summary: Andrew Lamb, a staff engineer at InfluxData and Apache DataFusion PMC chair, delivered a recorded presentation introducing DataFusion and emphasizing the need for community engagement. He explained how DataFusion serves as a reusable framework for building high-performance analytics systems, comparing it to LLVM and highlighting its core functionality and performance. He then discussed several use cases for DataFusion, gave an overview of the project's governance structure, and encouraged community involvement through the various contribution channels.