
How do you handle the context window limit? If you push the entire DOM to the LLM, it will far exceed the context window in most cases, won't it?


My guess is you do some preprocessing on the DOM to get it down to text that still retains some structure.

Something like https://github.com/Alir3z4/html2text.

I'm sure there are other (better?) options as well.


I wrote https://markdown.download as a general helper for this


It trims unwanted HTML elements and converts the rest to Markdown. That significantly reduces token counts while retaining structure.
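
For anyone who wants to see the trim-then-convert idea in code, here is a rough sketch using only Python's standard library. To be clear, this is not how markdown.download or html2text work internally; the tag lists and the `html_to_markdown` name are my own assumptions, and it only handles headings, links, and list items.

```python
import re
from html.parser import HTMLParser

# Assumed choices for this sketch: tags whose whole subtree is dropped,
# and the small set of tags that get a Markdown equivalent.
SKIP_TAGS = {"script", "style", "nav", "footer", "svg", "noscript"}
HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}
BLOCK_TAGS = {"p", "div", "ul", "ol", "li", "br", "section", "article", *HEADINGS}


class MarkdownSketch(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # output fragments, joined at the end
        self.skip_depth = 0  # > 0 while inside a dropped subtree
        self.href = None     # target of the link currently open, if any

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
            return
        if self.skip_depth:
            return
        if tag in BLOCK_TAGS:
            self.out.append("\n")  # block elements start on a new line
        if tag in HEADINGS:
            self.out.append(HEADINGS[tag])
        elif tag == "li":
            self.out.append("- ")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.out.append("[")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth:
            return
        if tag == "a":
            self.out.append(f"]({self.href or ''})")
            self.href = None
        elif tag in BLOCK_TAGS:
            self.out.append("\n")

    def handle_data(self, data):
        if not self.skip_depth:
            # Collapse whitespace runs but keep single spaces between inline parts.
            self.out.append(re.sub(r"\s+", " ", data))


def html_to_markdown(html: str) -> str:
    """Drop noisy subtrees and return a compact Markdown-like rendering."""
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    lines = (line.strip() for line in "".join(parser.out).split("\n"))
    return "\n".join(line for line in lines if line)
```

Calling `html_to_markdown(page_source)` gives one line per block element, with scripts, styles, and navigation removed. Because blocks are joined with single newlines, adjacent paragraphs are not separated by a blank line, which is fine as LLM input but not strictly faithful Markdown. A real converter also has to deal with tables, images, nested lists, and malformed markup, so for production use one of the existing tools is probably the better choice.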



