Over the past decade, the Python data ecosystem has become increasingly fragmented. This fragmentation stems largely from the growing popularity of data science, numerical computation, and deep learning, and from the proliferation of new libraries intended to serve those needs. While these libraries and frameworks have driven significant innovation, the fragmentation has a cost: users and downstream library maintainers cannot readily interoperate among the various libraries and must frequently write programs that target only a single library. The Python Data APIs consortium aims to address this problem by standardizing the fundamental data structures of arrays and dataframes, together with a common set of APIs for working with them, thus facilitating interchange and interoperation. During 2021, we achieved the following objectives:
Define a standardization methodology.
Develop the tooling necessary to support the standardization methodology.
Publish an array API standard RFC.
Publish a dataframe interchange protocol RFC.
Finalize 2021.0x API standards after community review.
For more information, consult the formal text of the respective specifications:
Array API: https://data-apis.org/array-api/latest/
Dataframe interchange protocol: https://data-apis.org/dataframe-protocol/latest/index.html
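To make the interoperability goal concrete, here is a minimal sketch of library-agnostic code in the style the array API standard encourages. The standard specifies an `__array_namespace__` method on array objects for retrieving the namespace of standardized functions. Everything else below is an assumption for illustration: `ToyArray`, `toy_namespace`, and `center` are hypothetical names, and the toy `mean` returns a plain float rather than the zero-dimensional array the standard describes.

```python
from types import SimpleNamespace


class ToyArray:
    """Hypothetical stand-in for a library's array type (illustration only)."""

    def __init__(self, values):
        self.values = [float(v) for v in values]

    def __array_namespace__(self, *, api_version=None):
        # Method name taken from the array API standard: it returns the
        # namespace that holds the library's standardized functions.
        return toy_namespace

    def __sub__(self, scalar):
        # Toy behavior: only array-minus-scalar is supported here.
        return ToyArray(v - scalar for v in self.values)


# Hypothetical namespace exposing a single function, `mean`.
toy_namespace = SimpleNamespace(
    mean=lambda a: sum(a.values) / len(a.values),
)


def center(x):
    """Subtract the mean from an array, whichever library created it."""
    xp = x.__array_namespace__()  # look up the functions from the array itself
    return x - xp.mean(x)


centered = center(ToyArray([1, 2, 3]))
print(centered.values)  # [-1.0, 0.0, 1.0]
```

Because `center` never imports a specific array library, the same function should in principle accept arrays from any library that implements the standard, rather than targeting a single one.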
Stephannie Jimenez Gacha
My name is Stephannie Jimenez, and I'm a software developer currently working at Quansight. I enjoy working on open-source projects and have a soft spot for pets.
Visit the speaker at: GitHub