kengoa · 2024-11-10T15:16:28 1731251788

I will try to add support for other data formats like xlsx and parquet in the future. The current CSV parsing is also a bit limited (e.g. it cannot deal with timestamps), so I will try to update the parsers first.

Also thanks for the error checker! I pushed the fixes in https://github.com/visprex/visprex.github.io/pull/4

kengoa · 2024-11-10T11:35:04 1731238504

Thanks for the comment. I would say the main benefit is speeding up the iteration between visualisation steps, as you might not want to be thinking about matplotlib syntax when trying to get a sense of data distributions.

kengoa · 2024-11-10T11:30:24 1731238224

> had to rely on tools like Google Sheets or Colab + Pandas for quick cleaning and wrangling before exploring different visualizations.

Yes, I had the same experience doing analytics work some years ago. As others have pointed out, Visprex only works on the happy path where the data is a clean CSV file, so I will definitely need to work on data cleaning. I have a DuckDB integration planned, but I'm not sure if this is easy enough for the target audience. I will try to add some predefined functionality, thanks for the feedback!

kengoa · 2024-11-10T11:25:52 1731237952

I'm very glad to hear this as this is exactly the target audience and the use case I initially thought of! I hope your students find it useful.

kengoa · 2024-11-10T11:24:12 1731237852

Yes, I do have plans for data preprocessing using DuckDB WebAssembly (I have an upcoming features section in this blog post: https://kengoa.github.io/software/2024/11/03/small-software....) but this will require SQL, which some of the target audience might not be familiar with. I'm thinking of something like the visual query builder from Metabase.

> With the UI I want to be able to toggle between different strategies quickly - strip characters from a column to treat it as numeric, if less than 2% or 5% of values have a character, fill na with mean, interpret dates in different formats - drop if the date doesn't parse

Those are really good examples, and I can make those predefined preprocessing techniques available as toggles in the dataset tab. Thanks for the feedback!
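
As a rough illustration of what two of the quoted strategies could look like as toggles, here is a minimal sketch. The function names `coerceNumeric` and `fillWithMean` and the default threshold are hypothetical and are not taken from Visprex; the input is assumed to be one column of raw string values.

```javascript
// Hypothetical helpers for illustration only; not part of Visprex.

// Strip non-numeric characters and treat the column as numeric, but only
// when the share of values containing such characters is below `threshold`.
// Returns null when too many values are "dirty" (leave the column as-is).
function coerceNumeric(values, threshold = 0.05) {
  const isClean = (v) => /^-?\d+(\.\d+)?$/.test(v.trim());
  const dirty = values.filter((v) => v.trim() !== "" && !isClean(v)).length;
  if (dirty / values.length >= threshold) return null;
  return values.map((v) => {
    const stripped = v.replace(/[^0-9.\-]/g, "");
    const n = stripped === "" ? NaN : Number(stripped);
    return Number.isNaN(n) ? null : n; // null marks a missing value
  });
}

// Replace missing values (null) with the mean of the values that are present.
function fillWithMean(values) {
  const present = values.filter((v) => v !== null);
  if (present.length === 0) return values.slice();
  const mean = present.reduce((a, b) => a + b, 0) / present.length;
  return values.map((v) => (v === null ? mean : v));
}
```

Keeping each strategy as a small pure function over a column would make it straightforward to switch one on or off from the UI and recompute the preview.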

kengoa · 2024-11-10T11:16:47 1731237407

I used d3.js for Visprex and some of those graphs are modified from examples in https://d3-graph-gallery.com/

kengoa · 2024-11-10T09:57:44 1731232664

> how this compares to a spreadsheet CSV import tool such as the one in Excel which is extremely flexible.

I would say the data loading functionality compares very poorly to Excel's CSV import for all the reasons you pointed out. I agree that users can run into those formatting issues, which could be resolved in another tool: Excel or Google Sheets for non-technical users, and Notepad++ or other editors for slightly more technical users. The assumption that CSV files are clean is a strong one, so I will try to surface import errors at least, and in the meantime point to different ways to format the data, as those tools will be complementary to Visprex.

> To me, this implies that no steps have been taken to manage user/data privacy.

This is a good point. I fixed the wording and now it simply reads "No tracking or analytics software is used". Thanks!

kengoa · 2024-11-09T23:02:18 1731193338

I hadn't considered beeswarm charts for this before. I will add them to the list of upcoming features. Thanks for the feedback :)

kengoa · 2024-11-09T22:57:48 1731193068

Thanks for trying it out! This is unfortunately not possible as of now. Parsing timestamps and datetimes is one of the high-priority tasks, as they are currently parsed incorrectly as strings (Categorical). I'm using Papa Parse to load CSV data and I will likely need to add a custom parser on top of it.

Some of those plans are mentioned in my blog post reflecting on building this app: https://kengoa.github.io/software/2024/11/03/small-software....

nerdponx · 2024-11-09T23:33:23 1731195203

You might also want to support a Unix timestamp as input, i.e. an integer or decimal number of (milli|micro|nano)seconds since the Unix epoch. No need to worry about messy date parsing there.
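
A minimal sketch of that idea, assuming the unit has to be guessed from the magnitude of the number. The function name `unixToDate` and the cut-off values are hypothetical; the heuristic misreads millisecond timestamps from the early 1970s as seconds, so a real importer would probably let the user override the unit.

```javascript
// Hypothetical helper: guess the unit of a Unix timestamp from its magnitude.
// The cut-offs are a heuristic assumption, not a standard.
function unixToDate(value) {
  const n = Number(value);
  if (!Number.isFinite(n)) return null; // not a number at all
  const abs = Math.abs(n);
  let ms;
  if (abs < 1e11) ms = n * 1000;      // treat as seconds
  else if (abs < 1e14) ms = n;        // treat as milliseconds
  else if (abs < 1e17) ms = n / 1e3;  // treat as microseconds
  else ms = n / 1e6;                  // treat as nanoseconds (sub-ms precision is lost)
  return new Date(ms);
}

// The epoch value shown next to this comment, read as seconds:
console.log(unixToDate("1731195203").toISOString());
// 2024-11-09T23:33:23.000Z
```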

nyclounge · 2024-11-10T00:19:09 1731197949

Maybe use dayjs to handle all kinds of weird string dates.

kengoa · 2024-11-10T15:26:11 1731252371

dayjs seems like exactly what I was looking for, thanks for the suggestion! I might have tried to write a parser myself otherwise.

kengoa · 2024-11-09T19:49:48 1731181788

> I'm fairly certain I am able to eventually cover 100% of JavaScript spec. Any ideas, questions or critique welcomed!

Do you have the results of test262_runner.rb? I came to know about test262 at a talk by Porffor's author, and something like https://github.com/CanadaHonk/porffor?tab=readme-ov-file#tes... in the README would be a great way to show this progress. Great project by the way!

drogus · 2024-11-09T20:40:14 1731184814

Yeah, at the moment it's passing about 12% of tests, but there is a lot of low-hanging fruit to implement, especially considering I started the project two weeks ago. This doesn't mean I will hit 100% in linear time, unfortunately, cause there is a long tail of builtin types and functions, but as I started by implementing the "hard parts" I left some easy parts not done. For example, I implemented only enough syntax to allow running conditionals and a while loop cause it was needed for the test262 harness, but I left all the other loops (for, for in, for of, do while) and switch statements unimplemented. Implementing them will be more or less analogous to the existing implementations for if/else and while.
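
To illustrate why the remaining loops are largely analogous to `while`, here is a simplified sketch of a `for` loop rewritten by hand in terms of `while`. This is only an illustration, not how the project implements it, and it ignores details such as `continue` and the fresh per-iteration `let` binding that the spec requires.

```javascript
// A plain `for` loop...
function sumFor(n) {
  let total = 0;
  for (let i = 0; i < n; i++) {
    total += i;
  }
  return total;
}

// ...and a hand-written equivalent using only a block and `while`.
function sumWhile(n) {
  let total = 0;
  {
    let i = 0;          // initialiser
    while (i < n) {     // test
      total += i;       // body
      i++;              // update
    }
  }
  return total;
}
```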

Once I finish implementing `await` and generators, which are the last hard-to-implement semantic concepts, I will be implementing that low-hanging fruit. It's hard to say how much coverage that will give me, but just to give an example: currently 1200 tests fail cause `object["foo"]` syntax is not implemented. I.e. `object.foo` works, but `object["foo"]` does not. It doesn't mean that those 1200 tests will automatically pass, cause they might be testing other stuff, but there are a lot of such relatively simple syntax omissions that make hundreds of tests fail.
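
For readers less familiar with the two forms, a short sketch of how dot and bracket access relate in standard JavaScript (the object and key names here are made up for the example):

```javascript
const object = { foo: 1, "not-an-identifier": 2 };
const key = "foo";

const viaDot = object.foo;          // static property name
const viaBracket = object["foo"];   // same property, written as a string key
const viaComputed = object[key];    // key only known at runtime

// Some keys can only be reached with bracket access:
const onlyBracket = object["not-an-identifier"];
```

Both forms read the same property when the key is a valid identifier, which is why a test can fail on the bracket form alone even though the underlying property lookup already works.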

And yes, I would love to have a nice graph like porffor has! :D

kengoa · 2024-11-10T15:33:27 1731252807

Yes, now that I'm looking at the tests, I can see that the coverage won't increase linearly, but 12% (with some of the hard parts) in two weeks is impressive. I starred the repo, good luck with your journey :)