Visions is a Python library for working with user-defined data type systems. Out of the box, it provides type inference and automated data cleaning of sequence data, with backend-specific implementations for pandas, Spark, Python, and NumPy. We often use it as a first-pass cleaning step when working with tabular data, and it simplifies the backend logic of both pandas-profiling[1] and our tabular data compression library, compressio[2].
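To give a rough feel for what type inference over sequence data means, here is a toy sketch in plain Python. It does not use visions or its API; `toy_infer_type` and `toy_cast` are hypothetical names made up for this illustration, and the only candidate types considered are int, float, and str.

```python
from typing import List, Sequence


def _all_castable(values: Sequence[str], cast: type) -> bool:
    """Return True if every value can be converted with `cast`."""
    try:
        for value in values:
            cast(value)
    except (TypeError, ValueError):
        return False
    return True


def toy_infer_type(values: Sequence[str]) -> type:
    """Pick the first of int, float that all values convert to, else str.

    Hypothetical helper for illustration only, not part of visions.
    """
    for candidate in (int, float):
        if _all_castable(values, candidate):
            return candidate
    return str


def toy_cast(values: Sequence[str]) -> List:
    """Convert every value to the inferred type (the 'cleaning' step)."""
    target = toy_infer_type(values)
    return [target(value) for value in values]


if __name__ == "__main__":
    print(toy_infer_type(["1", "2", "3"]))
    print(toy_cast(["1.5", "2"]))
```

The real library works with a graph of user-defined types rather than a fixed list of candidates, so treat this only as an intuition for the inference-then-cast idea.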
Because data types are user-defined, we can build user-customizable libraries around types without adding code complexity. In the case of compressio, that means users can apply any compression algorithm they want by simply passing a dictionary mapping `{type: compression algorithm}`, or define new compression algorithms for otherwise unsupported data types like shapely geometries, images, etc.
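The `{type: compression algorithm}` idea can be sketched with nothing but the standard library. This is not compressio's API: `compressors` and `compress_column` are hypothetical names, the types are plain Python types rather than visions types, and picking the type from the first value is a simplifying assumption.

```python
import bz2
import zlib
from typing import Callable, Dict, Sequence

Compressor = Callable[[bytes], bytes]

# Hypothetical user-supplied mapping: one compression algorithm per type.
compressors: Dict[type, Compressor] = {
    int: zlib.compress,
    str: bz2.compress,
}


def compress_column(
    values: Sequence,
    mapping: Dict[type, Compressor],
    default: Compressor = zlib.compress,
) -> bytes:
    """Compress a column with the algorithm registered for its type.

    Assumption for this sketch: the column type is the type of the first
    value; types missing from the mapping fall back to `default`.
    """
    kind = type(values[0]) if values else str
    algorithm = mapping.get(kind, default)
    # Serialize the values one per line before compressing.
    payload = "\n".join(map(str, values)).encode("utf-8")
    return algorithm(payload)


if __name__ == "__main__":
    blob = compress_column([1, 2, 3], compressors)
    print(zlib.decompress(blob))
```

Supporting a new data type in this sketch is just one more dictionary entry, which is the property the paragraph above describes.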
We hope you like it and would really appreciate feedback about how to make the library more useful and easier to use.
P.S. If you're interested in learning more about the project, the original paper is available on JOSS[3]. You can also check out our Numpy Global 2020 talk[4].
1. https://github.com/pandas-profiling/pandas-profiling
2. https://github.com/dylan-profiler/compressio
3. https://joss.theoj.org/papers/10.21105/joss.02145
4. https://www.youtube.com/watch?v=h2w99XIKizY