Hacker News | robertcprice's comments

It's funny to see how many people get offended by a project. I think I'm doing something right.

Is this some kind of complex humor that I don't understand? Or is it just not funny? I get it, but not the punchline.

Since this was posted I've been heads-down building on top of the neural CPU. Wanted to share what's new.

Built a GPU-Native UNIX OS. A full multi-process operating system running compiled C on Apple Silicon Metal:

> 25-command shell (ls, cd, cat, grep, sort, uniq, tee, cp, wc, pipes, background jobs, chaining, redirect) — ~17.5KB freestanding C compiled with aarch64-elf-gcc -O2, running entirely as ARM64 on the GPU

> Multi-process: fork/wait/pipe/dup2 via memory swapping. 1MB backing stores, up to 15 concurrent processes, round-robin scheduler, pipe blocking/wakeup, fork bomb protection, SIGTERM/SIGKILL, orphan reparenting. 28 syscalls total.

> Freestanding C runtime: malloc/free/printf/fork/wait/pipe/qsort/strtol — all on GPU
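As a rough illustration of the round-robin scheduling with pipe blocking/wakeup described above, here is a minimal host-side sketch in Python. It is not the project's code: the names `Proc`, `Scheduler`, and `MAX_PROCS` are hypothetical, and only the 15-process cap and the general behavior come from the description.

```python
from collections import deque
from dataclasses import dataclass

MAX_PROCS = 15  # cap taken from the post; everything else here is hypothetical


@dataclass
class Proc:
    pid: int
    blocked: bool = False  # True while waiting on an empty pipe


class Scheduler:
    def __init__(self):
        self.ready = deque()   # runnable processes, in round-robin order
        self.blocked = {}      # pid -> Proc, waiting for a pipe wakeup
        self.next_pid = 1

    def spawn(self):
        """Create a process, refusing once the cap is hit (fork bomb guard)."""
        if len(self.ready) + len(self.blocked) >= MAX_PROCS:
            return None
        proc = Proc(self.next_pid)
        self.next_pid += 1
        self.ready.append(proc)
        return proc

    def block(self, proc):
        """Park a process that tried to read from an empty pipe."""
        self.ready.remove(proc)
        proc.blocked = True
        self.blocked[proc.pid] = proc

    def wake(self, pid):
        """Make a parked process runnable again once data arrives."""
        proc = self.blocked.pop(pid)
        proc.blocked = False
        self.ready.append(proc)

    def pick_next(self):
        """Return the next runnable process and rotate it to the back."""
        if not self.ready:
            return None
        proc = self.ready.popleft()
        self.ready.append(proc)
        return proc
```

Blocked processes sit outside the ready queue, so the rotation skips them without any per-tick checks; a write to the pipe would call `wake` to put the reader back in line.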

Self-hosting C compiler on Metal GPU. cc.c (~2,800 lines) compiles C→ARM64 entirely on the GPU, then executes the output on the same GPU. Three layers: host GCC → GPU compiler → GPU-compiled binary. Debugged 5 codegen bugs to get it working (UBFM encoding, LDURSW sign-extension, caller-save clobbering, array subscript type clobbering, struct lvalue handling). Supports structs, pointers, arrays, recursion, for/while/do-while, ternary, sizeof, compound assignment, bitwise, short-circuit eval. 20/20 test programs pass. Mean compile: ~50K GPU cycles. Ackermann A(3,4) runs 319K cycles of deep recursion correctly.
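For reference, under the usual two-argument Ackermann-Péter definition (an assumption; the post does not spell out which variant the test uses), A(3,4) evaluates to 125. A quick host-side check in Python, separate from the project's C test program:

```python
import sys


def ackermann(m, n):
    """Two-argument Ackermann-Péter function, written the naive recursive way."""
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))


sys.setrecursionlimit(10_000)  # extra headroom for the nested calls
print(ackermann(3, 4))  # 125, matching the closed form 2**(4 + 3) - 3
```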

13+ compiled C applications on Metal:

> Crypto: SHA-256, AES-128 (ECB+CBC, 6/6 FIPS vectors pass), encrypted password vault

> Games: Tetris, Snake, roguelike dungeon crawler, text adventure

> VMs: Brainfuck interpreter, Forth REPL, CHIP-8 emulator

> Networking: HTTP/1.0 server (TCP proxied through Python)

> Neural net: MNIST classifier (784→128→10, Q8.8 fixed-point)

> Tools: ed line editor, self-hosting C compiler, Game of Life
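On the Q8.8 format mentioned for the MNIST classifier: it is a 16-bit fixed-point layout with 8 integer bits and 8 fractional bits, so a real value x is stored as roughly x * 256. A minimal Python sketch of the arithmetic follows. The helper names are hypothetical, and the project's C code may round or saturate differently.

```python
FRAC_BITS = 8
ONE = 1 << FRAC_BITS  # 1.0 in Q8.8 is 256


def to_q88(x):
    """Convert a float to Q8.8, saturating to the signed 16-bit range."""
    return max(-32768, min(32767, round(x * ONE)))


def from_q88(q):
    """Convert a Q8.8 integer back to a float."""
    return q / ONE


def q88_mul(a, b):
    """Multiply two Q8.8 values; the raw product has 16 fractional bits,
    so shift right by 8 to return to Q8.8 (floors toward negative infinity)."""
    return (a * b) >> FRAC_BITS


def q88_dot(weights, inputs):
    """Dot product with a wide accumulator, rescaled once at the end."""
    acc = sum(w * x for w, x in zip(weights, inputs))
    return acc >> FRAC_BITS


def relu(q):
    """ReLU works directly on the fixed-point integer."""
    return q if q > 0 else 0
```

Accumulating the raw products and shifting once, as in `q88_dot`, loses less precision than shifting after every multiply, which matters for a 784-input layer.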

neurOS — fully neural operating system. 11 trained models running MMU (100%), TLB (99.6%), cache (99.7%), scheduler (99.2%), assembler (100%), compiler (95.2%), watchdog (100%) — zero fallback paths.

Self-compilation verified: source → neural compiler → neural assembler → neural CPU → correct results.

Timing side-channel immunity. Measured sigma=0.0000 (standard deviation of GPU cycle counts) across 270 runs of AES-128. Same code on native Apple Silicon: 47-73% CoV. No caches, no branch predictor, no speculative execution inside a dispatch. T-table timing attacks are structurally impossible.
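For anyone unfamiliar with the metric, CoV is the coefficient of variation: standard deviation divided by the mean. Below is a minimal sketch of how it can be computed from per-run cycle counts. The sample values are invented for illustration, not measurements from the project, and whether the original numbers used population or sample standard deviation is not stated.

```python
import statistics


def coefficient_of_variation(samples):
    """Population standard deviation divided by the mean, as a percentage."""
    mean = statistics.fmean(samples)
    return 100.0 * statistics.pstdev(samples) / mean


# Made-up cycle counts, only to show the shape of the calculation.
constant_runs = [1000] * 270          # identical timing on every run
noisy_runs = [800, 1200, 1500, 600]   # timing that varies run to run

print(coefficient_of_variation(constant_runs))  # 0.0
print(coefficient_of_variation(noisy_runs))
```

A result of exactly zero means every run took the same number of cycles, so the timing carries no information about the key or data.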

Just reorganized the whole project — neurOS and GPU OS now live under a clean ncpu/os/ package (neuros/ and gpu/ subpackages). 850 tests passing, all verified after the reorg.

To @andreadev — the MUL>ADD inversion is still my favorite result. To @bob1029 — you're right about branchy workloads being slow (~5K IPS neural, ~4M compute), but the GPU execution model gives security properties CPUs architecturally can't provide.

