Hacker News | PerryStyle's comments

I work in HPC and I’ve found it very useful in creating various shell scripts. It really helps if you have linters such as shellcheck.

Other areas of success have been just offloading the typing/prototyping. I know exactly what the code should look like, so I rarely run into issues.


I would love to do this in the future, but knowing me I’d get more caught up in making sure I’m benchmarking properly than in actually writing code.


I’d definitely recommend Miryoku for those starting out. You’re then free to make any modifications to suit your preferences.

I ended up making the layer activations happen on the same hand to allow one-handed use.


Using this for my next build. Could you share more on how you did the activations for one-handed use? That sounds quite interesting.


It's not super complex. I ended up just modifying the locations of the layer toggle keys. In the default Miryoku layout, in order to switch the keys on the right hand to a different layer, you need to hold a button on the left hand. I found this annoying, since some actions, like entering and using a navigation layer, can be done with one hand.
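
To make the idea concrete, here is a minimal sketch in Python. It is a toy model, not Miryoku's or any firmware's actual keymap format: the `Layer` class, its fields, and the `one_handed` helper are all hypothetical names made up for illustration. It only captures which hand holds the layer key and which hand presses the layer's keys.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """Hypothetical, simplified description of one layer."""
    name: str
    hold_hand: str  # hand whose key must be held to activate the layer
    keys_hand: str  # hand that presses the keys on that layer


def one_handed(layers):
    """Return the names of layers that can be activated and used with one hand."""
    return [layer.name for layer in layers if layer.hold_hand == layer.keys_hand]


# Default-style arrangement described above: the hold key is on the opposite hand.
default = [Layer("nav", hold_hand="left", keys_hand="right")]

# Modified arrangement: the hold key is moved to the same hand as the layer's keys.
modified = [Layer("nav", hold_hand="right", keys_hand="right")]

print(one_handed(default))   # []
print(one_handed(modified))  # ['nav']
```

In a real keymap the change amounts to moving the layer-hold key to the other side; the model just shows why that makes the layer usable with a single hand.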


+1. Learned about this in DB research course during grad school. Feldera is really cool.

Also I love their website design.


Thanks for the kind words (Feldera co-founder here). I'll pass it on to the design team. :)


There are some solutions that try to tackle this in HPC. For example https://github.com/LLNL/mpibind is deployed on El Capitan.

Would be interesting to see if something similar appears for cloud workloads.


Do you have any good resources that go into detail on GPU ISAs or GPU architecture? There's certainly a lot available for CPUs, but the resources I’ve found for GPUs mostly focus on how they differ from CPUs and how their ISAs are tailored to the GPU's specific goals.


Unfortunately this is a topic that isn't open enough, and architectures change rather quickly so you're always chasing the rabbit. That being said:

The RDNA architecture slides (a few gens old) have some breadcrumbs: https://gpuopen.com/download/RDNA_Architecture_public.pdf

AMD also publishes its ISAs, but I don't think you'll be able to extract much from a reference-style document: https://gpuopen.com/amd-gpu-architecture-programming-documen...

Books on CUDA/HIP also go into some detail of the underlying architecture. Some slides from NV:

https://gfxcourses.stanford.edu/cs149/fall21content/media/gp...

Edit: I should say that Apple also publishes decent stuff. See the link here and the stuff linked at the bottom of the page. But note that now you're in UMA/TBDR territory; discrete GPUs work considerably differently: https://developer.apple.com/videos/play/wwdc2020/10602/

If anyone has more suggestions, please share.


I assume most people learn microarchitecture for performance reasons.

At which point, the question you are really asking is what aspects of assembly are important for performance.

Answer: there are multiple GPU Matrix Multiplication examples covering channels (especially channel conflicts), load/store alignment, memory movement and more. That should cover the issue I talked about earlier.
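
As a rough illustration of what a channel conflict looks like, here is a small Python sketch. It assumes a simple round-robin interleaving of addresses across memory channels; the channel count (8) and interleave size (256 bytes) are made-up parameters, and real GPUs may use different or hashed mappings.

```python
def channel_of(addr, num_channels=8, interleave=256):
    """Map a byte address to a channel under simple round-robin interleaving (assumed model)."""
    return (addr // interleave) % num_channels


def channel_histogram(stride, accesses=64, num_channels=8, interleave=256):
    """Count how many of `accesses` strided reads land on each channel."""
    counts = [0] * num_channels
    for i in range(accesses):
        counts[channel_of(i * stride, num_channels, interleave)] += 1
    return counts


# A stride equal to the interleave size spreads accesses over all channels.
spread = channel_histogram(stride=256)
# A stride of num_channels * interleave sends every access to the same channel.
conflict = channel_histogram(stride=8 * 256)

print(spread)    # [8, 8, 8, 8, 8, 8, 8, 8]
print(conflict)  # [64, 0, 0, 0, 0, 0, 0, 0]
```

Under this model, the second access pattern serializes on one channel while the others sit idle, which is the kind of effect the matrix multiplication examples work around by changing strides or padding.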

Optimization guides help. I know it's 10+ years old, but I think AMD's OpenCL optimization guide was easy to read and follow, and is still modern enough to cover most of today's architectures.

Beyond that, you'll have to watch conference talks about the new DirectX 12 instructions (wave instructions, ballot/voting, etc.) and their performance implications.
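
For anyone who hasn't met ballot/voting operations, here is a toy Python model of their semantics. It assumes a 4-lane "wave" for readability (real hardware typically uses 32 or 64 lanes), and the function names `ballot`, `wave_any`, and `wave_all` are illustrative, not any API's actual names. It says nothing about performance, only about what the operations compute.

```python
def ballot(predicates):
    """Return a bitmask with bit i set when lane i's predicate is true."""
    mask = 0
    for lane, pred in enumerate(predicates):
        if pred:
            mask |= 1 << lane
    return mask


def wave_any(predicates):
    """True if at least one lane's predicate is true."""
    return ballot(predicates) != 0


def wave_all(predicates):
    """True if every lane's predicate is true."""
    return ballot(predicates) == (1 << len(predicates)) - 1


values = [3, 8, 1, 9]            # one value per lane in a toy 4-lane wave
preds = [v > 2 for v in values]  # lanes 0, 1 and 3 pass the test
mask = ballot(preds)

print(bin(mask))             # 0b1011
print(bin(mask).count("1"))  # 3 lanes voted true
print(wave_any(preds), wave_all(preds))  # True False
```

On a GPU the whole wave gets this mask in a single instruction, which is why these operations are useful for things like compaction or skipping work when no lane needs it.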

It's a mixed bag: everyone knows one or two ways of optimizing, but learning all of them requires lots of study.


Branch Education apparently decapped and scanned a GA102 (Nvidia 30 series) for the following video: https://www.youtube.com/watch?v=h9Z4oGN89MU. The beginning is very basic, but the content ramps up quickly.


Wow, this is one of the most interesting things I’ve come across. I could definitely learn a lot by tinkering with this.

Thanks!


Would it be possible to leverage the Python array API standard? Or is that more suited to just computations?
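
For context, this is roughly how code written against the array API standard looks: a function asks its input for its namespace via `__array_namespace__()` and then only calls functions from that namespace. The sketch below is self-contained, so `ToyArray` and `ToyNamespace` are hypothetical stand-ins for a real array library, implementing just enough for the example; as far as I know, real libraries such as NumPy 2.x expose the same method on their arrays.

```python
import math


class ToyNamespace:
    """Hypothetical stand-in for an array library's namespace (not a real library)."""

    @staticmethod
    def sum(x):
        # Like the standard's reduction, return an array holding a single value.
        return ToyArray([math.fsum(x.data)])

    @staticmethod
    def sqrt(x):
        return ToyArray([math.sqrt(v) for v in x.data])


class ToyArray:
    """Hypothetical minimal array type that advertises its namespace."""

    def __init__(self, data):
        self.data = list(data)

    def __array_namespace__(self, api_version=None):
        return ToyNamespace

    def __mul__(self, other):
        return ToyArray([a * b for a, b in zip(self.data, other.data)])


def l2_norm(x):
    """Library-agnostic code: only uses the namespace obtained from the input."""
    xp = x.__array_namespace__()
    return xp.sqrt(xp.sum(x * x))


result = l2_norm(ToyArray([3.0, 4.0]))
print(result.data)  # [5.0]
```

The point is that `l2_norm` never imports a specific library, so the standard mostly helps with the computational part of an API rather than with things like I/O or storage.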


Zotero's PDF viewer also does this now. Being able to annotate PDFs and having a reference manager has been a life saver.


Just out of curiosity, have you checked out Spack (https://github.com/spack/spack)? It has a lot of HPC users. Support for mixing and matching both system and from-source dependencies has been extremely useful in my work.

