
This was shared on HN over a decade ago, but still stands the test of time: http://ciar.org/ttk/public/apigee.web_api.pdf

Thank you!

I think it's better to have the AI write scripts that extract the data required from logs vs directly shoving the entire log content into the AI.

An example of this: I had Claude analyze the hourly precipitation forecasts for an entire year across various cities. Claude saved the API results to .csv files, then wrote a (Python?) script to analyze the data and output only the 60-80% expected values. This avoided putting every hourly data point (8,700+ hours in a year) into the context.
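
The script itself is not the interesting part; the point is that only its tiny output enters the context. Roughly (sketched in TypeScript here; the file name and columns are made up):

```ts
// Sketch only: summarize hourly forecast rows from a CSV and print just the
// 60th/80th-percentile precipitation per city, instead of pasting 8,700+ rows
// into the model's context.
import { readFileSync } from 'node:fs';

const rows = readFileSync('hourly-precip.csv', 'utf8')
  .trim()
  .split('\n')
  .slice(1) // skip header: city,timestamp,precip_mm
  .map((line) => {
    const [city, , precip] = line.split(',');
    return { city, precip: Number(precip) };
  });

const byCity = new Map<string, number[]>();
for (const { city, precip } of rows) {
  if (!byCity.has(city)) byCity.set(city, []);
  byCity.get(city)!.push(precip);
}

const percentile = (xs: number[], p: number) => {
  const sorted = [...xs].sort((a, b) => a - b);
  return sorted[Math.min(sorted.length - 1, Math.floor(p * sorted.length))];
};

for (const [city, values] of byCity) {
  console.log(`${city}: p60=${percentile(values, 0.6)}mm p80=${percentile(values, 0.8)}mm`);
}
```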

Another example: at first, Claude struggled to extract a very long AI chat session to Markdown, so it only returned summaries of the chats. Later, after I installed the context mode MCP[1], Claude was able to extract the entire AI chat session verbatim, including all tool calls.

1. Sometimes?

2. Described above. I also built a tool that lets the dev/AI filter (browser dev console) logs to only the logs of interest: https://github.com/Leftium/gg?tab=readme-ov-file#coding-agen...

3. It would be interesting to combine your log compression with the scripting approach I described.

[1]: https://hw.leftium.com/#/item/47193064


This list is a little old, but I found some gems: https://web.archive.org/web/20191114220720if_/http://lazerwa...

I recall I enjoyed Hoplite, Data Wing, Mini Metro, Super Mario Run, and a few others.

---

You probably already know Apple Arcade curates a set of games. Many of the 'plus' versions of games have the ad/loot-box features stripped or set to "free."


A note on Super Mario Run: when this first came out I tried to play it on an airplane and it didn’t work. There was some sort of phone-home check on launch to make sure it was a legit copy. When it couldn’t perform this check, the game wouldn’t load.

Things could have changed since then, as this was many years ago, but something to look out for and check if this is a concern.


# I think it's possible to architect around this. For example, here is one idea:

- make the game as functional as possible: the game state is stored in a serializable format, and new game states are generated by combining the current game state with events (player input, clock ticks, etc.)

- the serialized game state is much more accessible to the AI because it is in the same language AI speaks: text. AI can also simulate the game by sending synthetic events (player inputs, clock ticks, etc)

- the functional, serialized game architecture is also great for unit testing: a text-based game state + text-based synthetic events results in another text-based game state. Exactly what you want for unit tests. (Don't even need any mocks or harnesses! See the sketch at the end of this comment.)

- the final step is rendering this game state. The part that AI has trouble with is saved for the very end. You probably want to verify the rendering and play-testing manually, but AI has been getting pretty decent at analyzing images (screenshots/renders).

# Here is an example of a simple game developed with functional architecture: https://github.com/Leftium/tictactoe/blob/main/src/index.ts

- Yes, it's very simple, but the same concepts apply to more complex games

- Right now, there is only rendering to the terminal, but you could imagine other renderers for the browser and game engines
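
A minimal sketch of the core idea (hypothetical names; the real repo differs, but the shape is the same):

```ts
// Pure functional game loop: plain data in, plain data out.
type Player = 'X' | 'O';
type Cell = Player | null;

interface GameState {
  board: Cell[]; // 9 cells, row-major
  turn: Player;
}

interface MoveEvent {
  type: 'move';
  index: number; // 0..8
}

const initialState: GameState = { board: Array(9).fill(null), turn: 'X' };

// (state, event) -> new state. No I/O, no globals, trivially serializable.
function update(state: GameState, event: MoveEvent): GameState {
  if (state.board[event.index] !== null) return state;
  const board = state.board.slice();
  board[event.index] = state.turn;
  return { board, turn: state.turn === 'X' ? 'O' : 'X' };
}

// Rendering is a separate, swappable last step: state -> terminal text
// (or DOM, or a game engine scene).
function renderToText(state: GameState): string {
  return [0, 3, 6]
    .map((row) => state.board.slice(row, row + 3).map((c) => c ?? '.').join(' '))
    .join('\n');
}

// "Unit test": serialized state + synthetic event -> another serialized state.
const next = update(initialState, { type: 'move', index: 4 });
console.assert(next.board[4] === 'X' && next.turn === 'O');
console.log(renderToText(next));
```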


That's true - state serialization can definitely help.

> AI has been getting pretty decent at analyzing images (screenshots/renders).

I've found AI to be hit or miss on this - especially if the image is busy with lots of elements. They're really good at ad-hoc OCR but struggle more with 3D visualizations in a game that might be using WebGL.

For example, setting up the light sources (directional, lightmaps, etc) in my 3D chess game to ensure everything looked well-lit while also minimizing harsh specular reflections was something VLMs (tested with Claude and Gemini) failed pretty miserably at.

https://shahkur.specr.net


How does this compare to another beautiful way to frame time + c that really made sense to me:

- Everything is moving through space-time at c: c is not a limit; it's just the speed at which everything moves

- Things that don't appear to be moving in the physical dimensions have most or all of c spent in the time dimension

- Things that move very fast in the physical dimensions have little or none of c spent in the time dimension (a short calculation after this list makes the trade-off explicit)

- I think this is similar to your section explaining time dilation, but doesn't require rotation: https://lisajguo.substack.com/i/190415584/time-dilation
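
In symbols (standard special relativity, nothing beyond the article's own setup), the "c budget" picture is just time dilation rearranged:

```latex
% "speed through space" v and "speed through time" c*dtau/dt always combine to c:
v^2 + \left(c\,\frac{d\tau}{dt}\right)^2 = c^2
\quad\Longleftrightarrow\quad
\frac{d\tau}{dt} = \sqrt{1 - \frac{v^2}{c^2}} = \frac{1}{\gamma}
```

At v = 0 the entire budget goes into the time term (clocks tick at full rate); as v approaches c the time term shrinks toward zero, which is the same time dilation the rotation picture gives.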

---

Other questions:

- Does this theory explain why we seem to only be able to travel through time in one direction? Why does the angle/direction of rotation (not) matter?


# My over-engineered console.log replacement is almost API/feature-stable: https://github.com/Leftium/gg

- Named `gg` for grep-ability and ease of typing.

- However, Claude has been inserting most calls for me (and can now read back the client-side results without any dev interaction!)

- Here is how Claude used gg to fix a layout bug in itself (gg ships with an optional dev console): https://github.com/Leftium/gg/blob/main/references/gg-consol...

---

# I've been prototyping realtime streaming transcription UX: https://rift-transcription.vercel.app

- Really want to use a dictation app in addition to typing on a daily basis, but the current UX of every app I've tried is insufficient.

---

# https://veneer.leftium.com is a thin layer over Google Forms + Sheets

- If you can use Google Forms, you can publish a nice-looking website with an optional form

- Example: https://www.vivimil.com

- Example: https://veneer.leftium.com/s.1RoVLit_cAJPZBeFYzSwHc7vADV_fYL...

- DEMO (feel free to try the sign up feature): https://veneer.leftium.com/g.chwbD7sLmAoLe65Z8


Seems like this one is Windows-only (even though it's Tauri?)

And it's not local (uses a cloud-based transcription API)

It also doesn't seem to be realtime streaming. To get the most connected typing experience, try showing results within a second of the first word spoken (not after the utterance is complete)

This HN comment captures why realtime streaming is important: https://hw.leftium.com/#/item/47149479

I've also been prototyping realtime streaming transcription with multimodal input: https://rift-transcription.vercel.app


There are literally two new dictation apps on Show HN every week: https://hn.algolia.com/?dateRange=pastWeek&page=0&prefix=fal...

This one is unique in that it supports iPhone. I haven't seen mobile support very often.

Despite all these apps, there are two things holding me back from using a dictation app on a regular basis:

- streaming transcription: see words in realtime

- multimodal input: mix voice with keyboard

So I started prototyping this type of realtime multimodal dictation UX: https://rift-transcription.vercel.app

This HN comment captures why streaming is important for transcription: https://hw.leftium.com/#/item/47149479


Streaming transcription is something I’m working on. The main challenge so far has been accuracy. Streaming models, especially cloud ones, often drop enough quality that the tradeoff isn’t always worth it. Local models look more promising, so streaming will likely land there first.

On multimodal input, the UX you’re prototyping where you switch between dictating and typing while composing is interesting. I haven’t really seen that approach before.

The direction I took is a bit different. Instead of mixing modalities mid-composition, dictation becomes context-aware during post-processing. Selected/copied text or surrounding field content can be inserted into the post-processing prompt so the spoken input is interpreted relative to what’s already on screen.


Yeah, I will add post-processing to my prototype, too. I already prepared a detailed spec (prototyping new ways to do this, as well): https://github.com/Leftium/rift-transcription/blob/main/spec...

One idea I was tossing around was streaming transcription + batch re-transcription:

- Use streaming transcription, which works most of the time (for example, I've found the Web Speech API pretty good, as well as Moonshine); see the sketch after this list

- If the streaming transcription was poor, select the bad part and re-transcribe with a more accurate batch transcription model.
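
The streaming part, in a generic browser sketch (plain Web Speech API, not code from my prototype):

```ts
// Stream interim results as the user speaks; final segments could later be
// queued for a more accurate batch re-transcription pass.
const SpeechRecognitionImpl =
  (window as any).SpeechRecognition ?? (window as any).webkitSpeechRecognition;

const recognition = new SpeechRecognitionImpl();
recognition.continuous = true;     // keep listening across pauses
recognition.interimResults = true; // emit partial hypotheses immediately

recognition.onresult = (event: any) => {
  let interim = '';
  let finalText = '';
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const transcript = event.results[i][0].transcript;
    if (event.results[i].isFinal) finalText += transcript;
    else interim += transcript;
  }
  // Show `interim` right away for the "connected" feel; mark poor `finalText`
  // segments as candidates for batch re-transcription.
  console.log({ interim, finalText });
};

recognition.start();
```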


I tested something similar and continuous re-transcription was the only way I could get close to batch-level accuracy.

In my current implementation I’m fairly aggressive with it. I don’t rely much on streaming word confidence. Instead I continuously reprocess audio using a sliding window. As new audio comes in, it’s retranscribed together with the previous segment so the model always sees a longer context.

That recovers a lot of the accuracy lost with streaming, but the amount of retranscription makes it hard to justify economically with cloud APIs. That’s why I’m focusing on a local-first approach for now.
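
Simplified, the loop looks something like this (transcribe() is a stub standing in for the local model; deciding when window text becomes "committed" as its audio ages out of the window is the fiddly part and omitted here):

```ts
const WINDOW_CHUNKS = 20; // how much trailing audio to re-feed each pass (tunable)

const chunks: Float32Array[] = [];
let committedText = ''; // text whose audio has already aged out of the window
let windowText = '';    // text covering the current window; overwritten each pass

async function transcribe(audio: Float32Array[]): Promise<string> {
  return ''; // call the local model here
}

async function onAudioChunk(chunk: Float32Array) {
  chunks.push(chunk);

  // Re-transcribe the whole trailing window, not just the new chunk,
  // so the model always sees longer context...
  windowText = await transcribe(chunks.slice(-WINDOW_CHUNKS));

  // ...and overwrite (rather than append to) the uncommitted tail of the display.
  render(committedText + windowText);
}

function render(text: string) {
  console.log(text);
}
```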


Raycast could be both native and written with React Native:

React Native itself renders JSX as native components (not a web view that renders HTML/CSS).

People conflate React with HTML because the DOM is the most common render target, but React can render to anything.
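
A toy illustration of that separation (this is not React's actual reconciler, just the idea that one element tree can feed interchangeable render targets):

```ts
// An "element tree" -- the kind of plain data JSX compiles down to --
// plus one of many possible render functions.
type ElementNode = {
  type: string;
  props: Record<string, string>;
  children: (ElementNode | string)[];
};

const tree: ElementNode = {
  type: 'view',
  props: {},
  children: [{ type: 'text', props: { color: 'red' }, children: ['Hello'] }],
};

// This renderer targets plain text; others could create DOM nodes,
// native views, or terminal UI from the very same tree.
function renderToText(node: ElementNode | string, depth = 0): string {
  if (typeof node === 'string') return '  '.repeat(depth) + node;
  return [
    '  '.repeat(depth) + `<${node.type}>`,
    ...node.children.map((child) => renderToText(child, depth + 1)),
  ].join('\n');
}

console.log(renderToText(tree));
```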


So validating your idea before building is better, but there is an even more "backwards" way:

You're still assuming people will be interested in one of your ideas. The chance of that is far from 100%.

To increase this chance closer to 100%: ask people what they are interested in. "Extract" the #1 problem shared by at least 10 people/businesses (that would be worth paying at least $50/month to fix). Then offer a solution to this problem.

> There are three types of problems: 1. hair-on-fire problem, 2. 2nd biggest problem, 3. everything else


Yes, exactly. I read OP's post and thought: why would I use this? If I really cared, I'd just copy-paste links to my projects or even build a small website. It doesn't solve any problem I have; I've never struggled to share projects.
