spaCy is much faster on the GPU. Many folks don't know that cuDF (a pandas-like DataFrame library for GPUs) parallelizes string operations (these are notoriously slow in pandas)... shrug...
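For anyone who hasn't tried it, here is a rough sketch of what that looks like. It assumes cuDF's `Series.str` accessor mirrors pandas for these particular calls (`strip`, `lower`, `contains`), and it falls back to pandas when cuDF can't be imported. The `clean_names` helper and the sample strings are made up for illustration, not taken from any benchmark.

```python
# Sketch: one .str pipeline that runs on cuDF (GPU) if present, else on pandas (CPU).
try:
    import cudf as xdf  # assumption: RAPIDS cuDF is installed and a CUDA GPU is usable
except Exception:
    import pandas as xdf  # CPU fallback; same Series.str calls for this example


def clean_names(values):
    """Strip whitespace, lowercase, and flag entries that contain 'arrow'."""
    s = xdf.Series(values)
    cleaned = s.str.strip().str.lower()
    # regex=False asks for a plain substring match instead of a regex search
    has_arrow = cleaned.str.contains("arrow", regex=False)
    return cleaned, has_arrow


# Hypothetical sample data
cleaned, has_arrow = clean_names(["  Apache Arrow ", "Polars", "cuDF  "])
print(cleaned)
print(has_arrow)
```

The point is that the calling code stays the same; whether it is actually faster depends on data size and on the cost of moving data to the GPU, so measure on your own workload.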
Apache Ballista and Polars both build on Apache Arrow and use SIMD.
The Polars homepage links to the "Database-like ops benchmark" of {Polars, data.table, DataFrames.jl, ClickHouse, cuDF*, spark, (py)datatable, dplyr, pandas, dask, Arrow, DuckDB, Modin}, but not yet PostgresML?
https://h2oai.github.io/db-benchmark/