On Sunday, I challenged myself to launch an entire project from start to finish using ChatGPT o1 Pro. I settled on building a web app to visualize the metadata (timestamps + GPS coordinates) of the photos and videos I've taken over the last 13 years.
To do this, I had to create an indexer that can parse photos and videos from all sorts of devices. I completed the project in six hours and open-sourced it on GitHub.
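For the GPS part specifically, the fiddly bit is that EXIF stores coordinates as degrees/minutes/seconds plus a hemisphere reference, which a mapping library won't accept directly. A minimal sketch of the conversion (the function name is mine; in a real indexer you'd pull the raw rationals out with a library like Pillow or exifread first):

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float, ref: str) -> float:
    """Convert EXIF-style degrees/minutes/seconds plus a hemisphere
    reference ('N'/'S'/'E'/'W') into signed decimal degrees for mapping."""
    value = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative in decimal notation.
    return -value if ref in ("S", "W") else value

# e.g. 37° 46' 30.0" N -> 37.775
```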
The biggest consideration with 4o vs o1-preview vs o1 vs o1 Pro is what I call “laziness.” 4o often won’t follow instructions completely or might cut corners to save CPU cycles. By contrast, o1 rarely does this, and o1 Pro never does.
Another factor is response time. The o1 series uses a “chain of thought” approach similar to human reasoning. I find o1 Pro can take up to ~1.5 minutes per response, making o1 feel lightning fast—though the extra time spent by o1 Pro usually yields more comprehensive results.
I believe IDE integrations like Cursor AI are the future. Constantly copying code between ChatGPT and my IDE isn't ideal, especially when I have to watch for omissions in the code or comments (again, not an issue with o1 Pro). Still, with solid version control it's manageable. I personally prefer the flexibility of Emacs + Magit (Emacs's Git plugin) + ChatGPT.
The large context window in o1 Pro is a game-changer. I can paste in my entire project (excluding node_modules and other unwanted directories) so ChatGPT fully understands the structure. A quick Bash script to copy file paths, names, and contents makes this trivial.
Note that o1 Pro still occasionally mixes in outdated packages, or cutting-edge ones I'd never use. It was a lot easier to wrestle out a fix, though, and it even found a copy of a deleted npm package hosted by another GitHub user.
Overall, I’m in awe of what’s on the horizon with o1 and o1 Pro. Sam Altman’s prediction of a single-person, billion-dollar company doesn’t seem far-fetched given the capabilities of these models.
Hope you enjoyed the thread! Here are the links if you want to visualize your old photos and videos on a map.
No ads, no trackers—just pure data insights. Check it out & contribute:
Six months ago, we set out to change Creative AI—no subscriptions, no limits, just a powerful native app you truly own. Imagine something that feels like Photoshop, works like Midjourney, and respects your privacy.
Today, we’re giving it away for free.
Discover why this is only the beginning for Sogni.
Isn't this all just armchair prophesying? Let's see some screenshots of actual exploits from anyone. It's hard to gain access to someone's shell unless it's 1990 and the server is using CGI-BIN. People are retweeting that this is "WORSE THAN HEARTBLEED!!!!111!" but Heartbleed literally left practically every server susceptible: I ran sample exploit code against a number of test hosts and saw MySQL queries and passwords streaming in plain text. Yeah, Shellshock matters, but I've yet to see the ground rumble and shake and a Y2K x 10000 happen. This seems like a big deal, but it actually isn't. Most likely no one can access your shell. Patch and move on.
I just open-sourced a REST API and framework that you can point any of your projects at, such as OpenClaw, so you can do TTS and STT using your own hardware.
It's especially focused on Mac M-class chips to utilize MLX:
https://github.com/Sogni-AI/sogni-voice
All free and open source.
Internally it uses Parakeet, Kokoro TTS, and Qwen3 TTS (with voice cloning support!). It also supports creating your own API key, so you can lock the API down to your own apps.
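Locking it down then just means sending that key with every request. A hypothetical client sketch, where the endpoint path, JSON fields, port, and bearer-token scheme are all my assumptions for illustration (check the repo's README for the real interface):

```python
import json
import urllib.request

API_KEY = "your-generated-key"       # the key you created to lock the API to your apps
BASE_URL = "http://localhost:8080"   # assumed local address; point at your own server

def build_tts_request(text: str, voice: str = "kokoro") -> urllib.request.Request:
    """Build a TTS request carrying the API key. The endpoint and field
    names here are illustrative guesses, not the documented sogni-voice API."""
    return urllib.request.Request(
        f"{BASE_URL}/v1/tts",
        data=json.dumps({"text": text, "voice": voice}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# audio = urllib.request.urlopen(build_tts_request("Hello!")).read()
```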