Niels Bantilan - Pandera: Beyond Pandas Data Validation | SciPy 2023
Data quality remains a core concern for practitioners of machine learning, data science, and data engineering, and in recent years specialized packages have emerged to validate and monitor data and models. However, as the open source community iterates on data frameworks, notably with highly performant entrants such as Polars, data quality libraries need to catch up and support them. In this talk, you will learn about Pandera and its journey from a pandas-only validator to a generic tool for testing arbitrary data containers, with the goal of providing a standardized way to create data validation tools.
----
SciPy is a community dedicated to the advancement of scientific computing through open source Python software for mathematics, science, and engineering. The annual SciPy Conference allows participants from all types of organizations to showcase their latest projects, learn from skilled users and developers, and collaborate on code development. SciPy is organized by NumFOCUS, a 501(c)(3) nonprofit organization that promotes open source scientific computing.