"The solar system's distant reaches exhibit a wealth of anomalous dynamical structure, hinting at the presence of a yet-undetected, massive trans-Neptunian body - Planet 9. Previous analyses have shown how orbital evolution induced by this object can explain the origins of a broad assortment of exotic orbits, ranging from those characterized by high perihelia to those with extreme inclinations. In this work, we shift the focus toward a more conventional class of TNOs, and consider the observed census of long-period, nearly planar, Neptune-crossing objects as a hitherto-unexplored probe of the Planet 9 hypothesis. To this end, we carry out comprehensive N−body simulations that self-consistently model gravitational perturbations from all giant planets, the Galactic tide, as well as passing stars, stemming from initial conditions that account for the primordial giant planet migration and sun's early evolution within a star cluster. Accounting for observational biases, our results reveal that the orbital architecture of this group of objects aligns closely with the predictions of the P9-inclusive model. In stark contrast, the P9-free scenario is statistically rejected at a ∼5σ confidence-level. Accordingly, this work introduces a new line of evidence supporting the existence of Planet 9 and further delineates a series of observational predictions poised for near-term resolution."
This is (partly) outdated. MPS (Metal Performance Shaders) is now, since torch 2.x, fully integrated into standard PyTorch releases; no external backends or special torch versions are needed.
There are few limitations left compared with other backends. Instead of the 'cuda' device, one simply uses 'mps' as the device.
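A minimal sketch of that device selection, using only the stock PyTorch 2.x API:

    import torch

    # Use the Apple-GPU backend when available, otherwise fall back to CPU.
    device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

    x = torch.randn(1024, 1024, device=device)
    y = x @ x  # the matmul executes on the Apple GPU via Metal Performance Shaders
    print(y.device)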
What remains is this: the optimizations PyTorch provides (especially compile() as of 2.1) focus on CUDA and the historic restrictions that result from CUDA being _not_ unified memory. A lot of energy goes into developing architectural workarounds to limit the copying between graphics hardware and CPU memory, resulting in specialized compilers (like Triton) that move parts of the Python code onto vendor-specific hardware.
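To make that copying concrete, here is a minimal sketch of the explicit host/device round trip a discrete-GPU (CUDA) setup requires; on unified memory this shuffling is precisely what goes away:

    import torch

    x_cpu = torch.randn(4096, 4096)   # tensor lives in host (CPU) memory
    x_gpu = x_cpu.to("cuda")          # explicit host -> device copy
    y_gpu = x_gpu @ x_gpu             # compute runs in GPU memory
    y_cpu = y_gpu.to("cpu")           # explicit device -> host copy back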
Apple's unified memory would make all of those complicated architectural workarounds mostly unnecessary, which they demonstrate with their project.
Getting current DL platforms to support both paradigms (unified and non-unified memory) will be a lot of work. One possible avenue is the MLIR project, currently leveraged by Mojo.
> This is (partly) outdated. MPS (Metal Performance Shaders) is now, since torch 2.x, fully integrated into standard PyTorch releases; no external backends or special torch versions are needed.
Not sure what you're referring to; the link I provided shows how to use the "mps" backend / device from the official PyTorch release.
> A lot of energy goes into developing architectural workarounds to limit the copying between graphics hardware and CPU memory
Does this remark apply to PyTorch running on NVIDIA platforms with unified memory, like the Jetsons?
The project probably at least partially serves as documentation for other platforms to integrate Apple Silicon acceleration. It basically demonstrates how to use macOS Accelerate and Metal MPS (Metal Performance Shaders) from C++ for machine learning and training optimization.
Thus other platforms can simply take this backend code and integrate it. (PyTorch basically did that already, with Apple's help.)
Nailed it.
I think more than partially. What happens in this repo will spread to the other major frameworks, and over time clever ideas that spawn in other projects will be reimplemented, with Apple's adjustments, back into the repo. It's a brilliant and efficient way to interact with the community, and one that can likely be measured in more sales of their hardware over time.
The Neural Engine is not helpful for training; it's inference hardware, whereas this targets training and research. They use Accelerate and Metal (with seemingly similar or identical performance shaders to those their PyTorch adaptation uses), which allows for high-performance training.
This project additionally serves as documentation for other platforms to integrate Apple Silicon, which is good.
Still, being able to run LLaMA 2 on the NPU would be awesome due to the unified memory. Apple restricting its use to only Apple-approved models is frankly irksome.
The main thing about this framework is that it uses unified memory with the GPU. This gives maximum performance. The Neural Engine, on the other hand, is optimized for low-energy inference (which is mostly an advantage on mobile devices) and imposes limitations and restrictions, since its hardware supports only very specific neural network operations. Thus supporting the Neural Engine within a universal machine learning platform doesn't make much sense; it would just be a bottleneck.
The way to use the Neural Engine is to convert existing models that strictly adhere to the limitations of the Neural Engine hardware (which exclude many operations used in unrestricted NN models) for use in energy-constrained inference applications only. It's a different application scenario. A rough sketch of that conversion path is below.
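A rough sketch of that workflow, assuming the coremltools package is installed; the toy two-layer model is purely illustrative, and Core ML decides at load time which ops actually land on the ANE:

    import torch
    import coremltools as ct

    # Toy model standing in for a real network; any traceable module works.
    model = torch.nn.Sequential(torch.nn.Linear(64, 64), torch.nn.ReLU()).eval()
    example = torch.randn(1, 64)
    traced = torch.jit.trace(model, example)

    # Convert to Core ML, requesting CPU + Neural Engine execution (inference only).
    mlmodel = ct.convert(
        traced,
        inputs=[ct.TensorType(shape=example.shape)],
        compute_units=ct.ComputeUnit.CPU_AND_NE,
        convert_to="mlprogram",
    )
    mlmodel.save("model.mlpackage")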