Hacker News | kengoa's comments

I will try to add support for other data formats like xlsx and parquet in the future. The current CSV parsing is also a bit limited (e.g. it cannot deal with timestamps), so I will try to update the parsers first.

Also thanks for the error checker! I pushed the fixes in https://github.com/visprex/visprex.github.io/pull/4


Thanks for the comment. I would say speeding up the iteration between visualisation steps is the main benefit, as you might not want to be thinking about matplotlib syntax when trying to get a sense of data distributions.


> had to rely on tools like Google Sheets or Colab + Pandas for quick cleaning and wrangling before exploring different visualizations.

Yes, I had the same experience doing analytics work some years ago. As others have pointed out, Visprex only works on the happy path where the data is a clean CSV file, so I will definitely need to work on data cleaning. I have a DuckDB integration planned, but I'm not sure if this is easy enough for the target audience. I will try to add some predefined functionality, thanks for the feedback!


I'm very glad to hear this as this is exactly the target audience and the use case I initially thought of! I hope your students find it useful.


Yes, I do have plans for data preprocessing using DuckDB WebAssembly (there is an upcoming features section in this blog post: https://kengoa.github.io/software/2024/11/03/small-software....), but this will require SQL, which some of the target audience might not be familiar with. I'm thinking of something like the visual query builder from Metabase.

> With the UI I want to be able to toggle between different strategies quickly - strip characters from a column to treat it as numeric, if less than 2% or 5% of values have a character, fill na with mean, interpret dates in different formats - drop if the date doesn't parse

Those are really good examples, and I can make those predefined preprocessing techniques available as toggles in the dataset tab. Thanks for the feedback!
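To make the quoted strategies concrete, here is a rough sketch of what such toggles could do. It is written in plain Python for brevity and is not Visprex code; the function names (`strip_to_numeric`, `fill_na_with_mean`, `drop_unparseable_dates`), the default threshold, and the default date format are all assumptions for illustration.

```python
import re
from datetime import datetime
from statistics import mean

# Hypothetical notion of a "clean" numeric cell: optional minus, digits, optional decimals.
NUMERIC = re.compile(r"-?\d+(?:\.\d+)?")


def to_float(text):
    """Return the cell as a float, or None if it is empty or not purely numeric."""
    text = text.strip()
    return float(text) if NUMERIC.fullmatch(text) else None


def strip_to_numeric(values, max_dirty_fraction=0.05):
    """Strip stray characters and treat the column as numeric, but only when
    at most `max_dirty_fraction` of the non-empty cells contain such characters.
    Returns None when the column should stay non-numeric."""
    non_empty = [v for v in values if v.strip()]
    dirty = [v for v in non_empty if to_float(v) is None]
    if not non_empty or len(dirty) / len(non_empty) > max_dirty_fraction:
        return None
    # Keep only digits, dots and minus signs, then parse; empty cells become None.
    return [to_float(re.sub(r"[^0-9.\-]", "", v)) for v in values]


def fill_na_with_mean(values):
    """Replace missing (None) entries with the mean of the present ones."""
    present = [v for v in values if v is not None]
    if not present:
        return values
    fill = mean(present)
    return [fill if v is None else v for v in values]


def drop_unparseable_dates(rows, column, fmt="%Y-%m-%d"):
    """Keep only the rows whose `column` parses with the given date format."""
    kept = []
    for row in rows:
        try:
            datetime.strptime(row[column], fmt)
        except ValueError:
            continue
        kept.append(row)
    return kept
```

Each function maps to one toggle, so the 2% vs 5% choice would just be a different `max_dirty_fraction`, and trying another date format would be a different `fmt`.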


I used d3.js for Visprex and some of those graphs are modified from examples in https://d3-graph-gallery.com/


> how this compares to a spreadsheet CSV import tool such as the one in Excel which is extremely flexible.

I would say the data loading functionality compares very poorly to Excel's CSV import for all the reasons you pointed out. I agree that users can run into those formatting issues, which could be resolved in another tool: Excel or Google Sheets for non-technical users, and Notepad++ or another editor for slightly more technical users. The assumption that CSV files are clean is a strong one, so I will try to surface import errors at least, and in the meantime point to different ways to format the data, as those tools will be complementary to Visprex.

> To me, this implies that no steps have been taken to manage user/data privacy.

This is a good point. I fixed the wording and now it simply reads "No tracking or analytics software is used". Thanks!


I hadn't considered beeswarm charts for this before. I will add them to the list of upcoming features. Thanks for the feedback :)


Thanks for trying it out! This is unfortunately not possible as of now. Parsing timestamps and datetimes, which are currently incorrectly parsed as strings (Categorical), is one of the high-priority tasks. I'm using Papa Parse to load CSV data and I will likely need to add a custom parser on top of it.
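As a rough idea of what such a custom step could look like, here is a minimal sketch of per-column type inference. It is plain Python rather than anything built on Papa Parse, and the function name and the "Number" and "DateTime" labels are hypothetical (only "Categorical" is the label mentioned above). It also assumes ISO 8601 style strings, which is just one of many date formats found in real CSV files.

```python
from datetime import datetime


def _is_number(text):
    # Note: float() also accepts strings such as "nan" and "inf".
    try:
        float(text)
        return True
    except ValueError:
        return False


def _is_iso_datetime(text):
    # Assumption: dates look like "2024-11-03" or "2024-11-03T12:30:00".
    try:
        datetime.fromisoformat(text)
        return True
    except ValueError:
        return False


def infer_column_type(values):
    """Guess a column type from its raw string cells, ignoring empty cells.
    Numbers are checked first so that numeric columns are never read as dates."""
    non_empty = [v.strip() for v in values if v.strip()]
    if not non_empty:
        return "Categorical"
    if all(_is_number(v) for v in non_empty):
        return "Number"
    if all(_is_iso_datetime(v) for v in non_empty):
        return "DateTime"
    return "Categorical"
```

A column only becomes "DateTime" when every non-empty cell parses, so a single malformed value falls back to "Categorical" instead of silently dropping data.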

Some of those plans are mentioned in my blog post reflecting on building this app: https://kengoa.github.io/software/2024/11/03/small-software....


You might also want to support a Unix timestamp as input, i.e. an integer or decimal number of (milli|micro|nano-)seconds since the Unix epoch. No need to worry about messy date parsing there.
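One way to handle the different units without asking the user is to guess the unit from the magnitude of the number. The sketch below is only a heuristic under that assumption; the function name and the cut-off values are made up for illustration, and timestamps very close to the epoch are ambiguous (a millisecond value from the early 1970s would be read as seconds).

```python
from datetime import datetime, timezone


def epoch_to_datetime(value):
    """Convert a Unix timestamp in s, ms, µs or ns to a UTC datetime,
    guessing the unit from the magnitude of the value (a heuristic)."""
    magnitude = abs(value)
    if magnitude < 1e11:
        divisor = 1              # seconds
    elif magnitude < 1e14:
        divisor = 1_000          # milliseconds
    elif magnitude < 1e17:
        divisor = 1_000_000      # microseconds
    else:
        divisor = 1_000_000_000  # nanoseconds
    return datetime.fromtimestamp(value / divisor, tz=timezone.utc)
```

For example, `epoch_to_datetime(1730592000)` and `epoch_to_datetime(1730592000000)` both give 2024-11-03 00:00:00 UTC.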


Maybe use dayjs to handle all kinds of weird string dates.


dayjs seems like exactly what I was looking for, thanks for the suggestion! I might have tried to write a parser myself otherwise.


> I'm fairly certain I am able to eventually cover 100% of JavaScript spec. Any ideas, questions or critique welcomed!

Do you have the results of test262_runner.rb? I came to know about test262 at a talk by porffor's author, and something like https://github.com/CanadaHonk/porffor?tab=readme-ov-file#tes... in the README would be a great way to show this progress. Great project by the way!


Yeah, at the moment it's passing about 12% of tests, but there is a lot of low-hanging fruit to implement, especially considering I started the project two weeks ago. This doesn't mean I will hit 100% in linear time, unfortunately, cause there is a long tail of builtin types and functions, but as I started by implementing the "hard parts" I left some easy parts not done. For example, I implemented only enough syntax to allow running conditionals and a while loop, cause it was needed for the test262 harness, but I left all the other loops (for, for in, for of, do while) and conditional statements (switch) unimplemented. Implementing them will be more or less analogous to the existing implementations for if/else and while.

Once I finish implementing `await` and generators, which are the last hard-to-implement semantic concepts, I will be implementing that low-hanging fruit. It's hard to say how much coverage that will give me, but just to give an example: currently 1200 tests fail cause `object["foo"]` syntax is not implemented. I.e. `object.foo` works, but `object["foo"]` does not. It doesn't mean that those 1200 tests will automatically pass, cause they might be testing other stuff, but there are a lot of such relatively simple syntax omissions that make hundreds of tests fail.

And yes, I would love to have a nice graph like porffor has! :D


Yes, now that I'm looking at the tests I can see that the coverage won't increase linearly, but 12% (with some of the hard parts) in two weeks is impressive. I starred the repo, good luck with your journey :)

