
Did you also remove some roles so that it looks like you're younger? I've heard that's the only way to get interviews these days, as senior CVs are getting thrown out (too old, too expensive)

I thought they weren't hiring entry level devs anymore because the higher ups think that AI can do those jobs?

They hire people in their 20s to do senior roles on the cheap and throw away people in their 30s or older (if you’re 40 you might as well be dead), as they are seen as expensive

I've done some work in this area with large American banks (around a decade ago, so it might have changed!) and generally the only problem you could have is if you were abroad and using a new device – you wouldn't get locked out, but it might trigger some extra verification steps

Yes, this is still a thing. Last year I installed Linux on an old laptop to use while traveling in SE Asia. Had problems with TD Bank because of this, complicated by my account needing SMS to the US number for the extra verification.

hnfong was referring to emergent abilities, not to "the basics" – the definition of emergent abilities as they apply to language models is that they are not present in smaller models, only in large ones.

That definition also implies that a model without "emergence" would not be large.

If you happen to know exactly why this happens, I'll definitely read your paper!


Prepare to be underwhelmed - no need for a paper for this one: The only emergent ability of an LLM is that the model is more robust when there are more samples. When the number of samples is in the trillions instead of thousands, there are a lot more complete concepts available for it to match to your query.

The so-called "emergent ability" of a chat-focused LLM is its accuracy, which is only possible with enough sample data, and also only possible if it's good at matching a query to the right samples, and wording a response in a way that's pleasing to the end user - something even the mainstream LLMs struggle to do well.

IME, pretty much only GPT-4 and Mistral are even that good. Most of them aren't that good at anything yet, and half the time don't follow what I ask at all. Being a very large model is only part of what makes it good.

The sad state of Hacker News is not how many people conflate "LLM" with "chatbot" but that an academic paper holds more weight than a library you can npm install right now and try. Seems we lost our way if abstract appeals to authority hold more weight than evidence before your eyes.

Thanks for your comment! I do appreciate the discussion.


Exactly my thoughts, this isn’t really an LLM. I understand it’s just a personal project but it’s probably good to know LLMs are not just lookup tables.

Also, I’m not sure I understand the explanation of why JS objects are so great for this – you could do the same thing with a Python dict, for example, or with equivalent mapping types in any programming language.

This is also missing an attention mechanism, which is important to understand even for a toy LLM, as it's part of what makes the responses so accurate to the context, rather than just predicting what word usually comes next after another word (or sequence of words)


> you could do the same thing with a python dict for example or equivalent mapping types in any programming language.

Yes, any language with a hashmap of sorts with O(1) lookups should be equivalent.
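
As a sketch of the idea in Python (a toy bigram model with made-up names, nothing like the library's actual API):

    import random
    from collections import defaultdict

    # Toy next-token predictor backed by a plain dict (hashmap):
    # train() builds the lookup table, predict() does an O(1)
    # average-case lookup on the previous token.
    def train(tokens):
        table = defaultdict(list)
        for prev, nxt in zip(tokens, tokens[1:]):
            table[prev].append(nxt)
        return table

    def predict(table, token):
        candidates = table.get(token)
        return random.choice(candidates) if candidates else None

    corpus = "the cat sat on the mat and the cat ran".split()
    model = train(corpus)
    print(predict(model, "the"))  # "cat" or "mat", picked at random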


Yes, the time complexity is how it can all be held in memory without crashing (because it isn't really all held in memory, like it would be with text), but that's only part of it. You could not choose "any programming language" for it, because not all of them are good at lookups, and not all languages have callback functionality or the ability to natively enqueue and fork code flow. It would take forever to do this, and it probably wouldn't work in most languages.

The library never claims to be an LLM - it's a next token prediction library that you can use to create LLMs.

Python - one nice thing is all the "keys" retain their primitive type, whereas in JS they all turn into strings. If Python wasn't so slow compared to JS that would matter a lot and I might use it, but the speed comparison isn't even close.


You must be joking, the title of this post literally says "build fast LLMs from scratch" but the code is neither an LLM, nor particularly fast.

Python dicts are actually different from JS objects: Python uses a hashmap behind the scenes, while most fast JS VMs apply heuristics to decide whether to use a hashmap or to represent the object as a static struct with a known offset for each key. This explains it better than I can: https://v8.dev/blog/fast-properties

Nevertheless, for your specific use case I would be really surprised if there was any significant difference between the performance of Python and JS – hashmaps are fast enough for this – plus with JS objects you might be trading insertion performance for access performance in some cases, as there is overhead in creating a V8-style "fast property" object


Not joking. You seem to be confused in thinking LLMs are built with other LLMs, but that's not the case. Otherwise why would you say "it says 'build fast LLMs from scratch' but the code is not an LLM" ?? Why would the code of a library to build LLMs need to also be an LLM?

Getting off-topic, but Python is incredibly slow at lookups (and most things) compared to JavaScript, and it isn't even close; not all dynamic languages are the same. This is pretty widely known, and a quick Google search yields plenty of benchmarks and articles! Give it a try.

Python is used in AI/ML for its libraries (convenience), not because it's a fast language. There are 3 main reasons: 1) Time complexity of data structures is lower in JS; that's the primary exploit at play here, 2) V8 compiles to machine code in fewer steps than Python, and 3) Process forking - the concept that functions can run in parallel in the same thread.

Thanks for your comment!


Ok, not to belittle your work or anything, but I think you are either not using the words correctly, or trolling us here.

If we used your "non-LLM" library to build an LLM, it wouldn't be "from scratch", would it? Therefore it is natural to assume that you meant that your code is supposed to be "an LLM written from scratch".


Not trolling, though I feel the same way about you with this post haha.

To answer your question, this library doesn't provide something like a `chat` method and doesn't provide anything like Stable Diffusion's `img2img` API either - that's not what it is - it's a library for predicting the next token (letter, word, pixel, etc.) based on data you train it on.

The most typical use cases for this model are: Auto-completion, auto-correct, spell check, search, etc.

However, if you watched the incredible video I shared in the original post you'd know that this completion concept can be applied to other interfaces: Chat, image generation, audio analysis/generation, etc. and in fact it is applied to LLMs like GPT.

This library doesn't get into any chat-specific methods like `ask` (question & answer), it just completes token sequences. If your goal is to create a ChatGPT clone, you would have to add methods like `chat`, `ask`, etc. and provide code for ongoing conversations - probably using an NLP library to parse queries into cursors like I'm trying in another project, or use "MULTI-HEADED ATTENTION BLOCKS" (lmfao) if you fancy that. Or if you wanted to create a Stable Diffusion clone, you'd have to provide the image-related methods needed. As far as the meaning of "from scratch" - I mean compared to using Ollama etc. to run local models, or using OpenAI - I'm just trying to enable you to build whatever model suits your data and use case.


I feel you must be trolling now by how confidently incorrect you are, but in case you are not:

> Time complexity of data structures is lower in JS

What data structures? All of them? You know you can implement your own data structures if you need them to be optimal

> V8 compiles to machine code in fewer steps than Python

The "steps" or time it takes to compile has no bearing on runtime performance. v8 is generally faster on micro-benchmarks, but in python you spend most of your time calling out to libraries that are heavily optimised, the javascript library ecosystem is a complete joke for ML/AI compared to python's – for example there is nothing that can compare to numpy in js.

> Process forking - the concept that functions can run in parallel in the same thread

This is an OS feature and has nothing to do with JS or Python. I have to point out, though, that when you fork you are creating a new process (with its own thread) - not running in parallel within the same thread.
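
For what it's worth, Python ships the same capability in its standard library – a minimal sketch with a toy CPU-bound function:

    from multiprocessing import Pool

    # Fan CPU-bound work out across OS processes. Each worker is a separate
    # process (with its own main thread) - not parallelism "in the same thread".
    def square(n):
        return n * n

    if __name__ == "__main__":
        with Pool(processes=4) as pool:
            print(pool.map(square, range(8)))  # [0, 1, 4, 9, 16, 25, 36, 49]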

This will be my last comment, it is clear to me now that you posted this thread to stroke your own ego and have no interest in actually learning anything. Good luck with your GPT-killer :)


Not joking, not trolling, thought it was widely known that JavaScript is generally significantly faster than Python. Haven't had this debate in 15 years, but let me explain:

> What data structures?

Everything in JavaScript is an object, even Arrays. So even an Array lookup is O(1) in JavaScript - not true in Python where it has to search the Array. Only if you created key/val pairs in Python (via list/hash) could you exploit the O(1) lookup for accessing data, but I can't use Python list for a large model like an LLM without hitting memory errors (see: https://stackoverflow.com/questions/5537618/memory-errors-an...)

> Python is run-time

Both languages are dynamic (not compiled) lol, what are you trying to say here? The point is that the number of steps it takes to go from high-level dynamic code (scripts) to machine-readable instructions is 1 step in JS, but 2 steps in Python, that's literally why it's slower to RUN than JavaScript. Literally runs slower as in number of executions.

> Multi-process is an OS feature that has nothing to do with JS or Python

Not true in the slightest. It's a language feature. I'll use my favorite word of the day and say it's an "exploit" more than a feature, when you can run what would be blocking IO in parallel. Python on the other hand is "single flow" executing statements one-at-a-time.

What a toxic comment, I said thanks to everyone else but not you. I retract my thank you! I hope you learned something today at least. This made me LOL: "JS uses some heuristics to decide whether to use a hashmap or to represent the object as a static struct" – neither hashmap nor struct exists in JS, just funny. There's ES6 Map, but that is really just an Object, not the other way around lol


> Everything in JavaScript is an object, even Arrays. So even an Array lookup is O(1) in JavaScript - not true in Python where it has to search the Array.

Huh? Array and list lookups are O(1) in Python too. Who told you that? Searching is different, and it's O(n) in both JS and Python. Can you really believe two mainstream programming languages can have such a drastic difference in time complexity?
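
If in doubt, here's a quick and dirty check (illustrative only – absolute timings vary by machine, but neither grows with n):

    import timeit

    # Indexed list access and dict lookup are both O(1) average case in
    # Python: the per-lookup time stays flat as the container grows.
    for n in (10_000, 1_000_000):
        lst = list(range(n))
        dct = {i: i for i in range(n)}
        t_list = timeit.timeit(lambda: lst[n - 1], number=100_000)
        t_dict = timeit.timeit(lambda: dct[n - 1], number=100_000)
        print(f"n={n}: list {t_list:.3f}s, dict {t_dict:.3f}s")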

Also, not my comment, but:

> "JS uses some heuristics to decide whether to use a hashmap or to represent the object as a static struct" – neither hashmap nor struct exists in JS, just funny.

I'm pretty sure they meant that JavaScript internally uses either a hashmap or a struct (you can do something close in Python using __slots__) to represent an object. Those are standard data structure names, even though JavaScript doesn't use the same terminology. Python doesn't call them hashmaps either.
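
For example, here's a rough Python analogue of that struct-vs-hashmap distinction (a sketch – this is not how V8 itself works, just the closest Python equivalent):

    import sys

    class SlottedPoint:
        __slots__ = ("x", "y")  # fixed attribute layout, struct-like, no __dict__

        def __init__(self, x, y):
            self.x = x
            self.y = y

    class DictPoint:  # default: attributes stored in a per-instance hashmap
        def __init__(self, x, y):
            self.x = x
            self.y = y

    print(sys.getsizeof(SlottedPoint(1, 2)))        # small, fixed layout
    print(sys.getsizeof(DictPoint(1, 2).__dict__))  # the backing dict (hashmap)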


My bad, it's on the insertion side: you can O(1) insert into the middle of Array-likes like Set in JS (which I use in this lib), whereas I think Python is O(n) for all its Array-likes. Think you can O(1) append to an Array though.

But there are at least a few other reasons Python is generally slower than JavaScript. It's popular in ML because it was popular in math, and it was popular in math because it's easy to teach in schools. People use Python for its ML libraries, and now its ML communities, not because it's a fast language - it isn't, really. And it has no browser implementation, so there are a few reasons I didn't choose Python.

Reading it again I think I understand what the other poster meant now, but they also said "multi-process has nothing to do with the language" when it does have a lot to do with scaling Node, especially to handle concurrency. I did a little demo here a while ago to show what I mean - check out the browser tab example: https://github.com/bennyschmidt/simple-node-multiprocess

Thanks for your comments, I hate language vs language debates! :D


> you can O(1) insert into the middle of Array-likes like Set in JS

Sets are supposed to retain the insertion order. I don't think you can insert into the middle, let alone in O(1) time (as they are hashmaps, not linked lists).

> I hate language vs language debates!

I wasn't trying to compare JS to Python. My aim was to clarify a misunderstanding re: data structures used.


In the case of `Set` and `Map` they're keys though. It's sorted in that you can reference `data[1]` and get that element, but I don't know that the keys are necessarily sorted.

This guy says: "V8 team explains how some memory optimization was done on its hashtable implementation for Map, Set, WeakSet, and WeakMap: Optimizing hash tables: hiding the hash code.

"Based on 1 and 2: V8's Set and Map's get & set & add & has time complexity practically is O(1)."

https://stackoverflow.com/questions/33611509/es6-map-and-set...


I wouldn't bother arguing with this guy, he deliberately misquoted me on every point to make it seem like what he's saying is correct.

Indeed, I was talking about the internals of V8, and even linked a blog post explaining it. V8 is written in C++, so it doesn't use JavaScript data structures anyway.


Haha dude you think Python is fast, and said multi-process has nothing to do with JavaScript, so of course I'm going to write off anything you think at this point because you know nothing about dynamic languages.

I missed the V8 bit because I was replying to a lot of people - your overall point is still terrible because JavaScript is a lot faster than Python. Think about the larger point you're making.

Btw sometimes you say "it's just an OS feature" when convenient but you don't say "it's just an interpreter feature" (regarding V8) - I wonder at what point in the debate that will come out. That's why I hate these kinds of debates, they're all signaling and changing subjects and losing sight of the point being made.

Whenever I'm talking about time complexity, I'm in a terrible conversation that I can't wait to get out of. The bottom line is everybody in the world except you knows Python is slow as hell, and for more than 1 reason.


> but it’s probably good to know LLMs are not just lookup tables.

But they are almost literally lookup tables, actually!

The thing is they don't perform the lookup on a single word as a key, but on the entire context, which is what makes them special. But besides that it's “just” a lookup table.
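
In toy terms (a Python sketch with made-up names), the only change from a word-level lookup table is that the key becomes a context window:

    from collections import defaultdict
    import random

    # Same lookup-table idea, but keyed on the preceding context
    # (a tuple of tokens) instead of a single word.
    def train(tokens, context=2):
        table = defaultdict(list)
        for i in range(len(tokens) - context):
            table[tuple(tokens[i:i + context])].append(tokens[i + context])
        return table

    tokens = "the cat sat on the mat and the cat ran".split()
    model = train(tokens)
    print(random.choice(model[("the", "cat")]))  # "sat" or "ran"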


A deep attention network is absolutely not a lookup table

An attention head is quite literally a lookup table!

Multiple lookup tables

Bit of an oversimplification though, no one would look at Postgres and go “it’s just a couple lookup tables put together”

As an implementation? No.

For fundamental understanding of the logical model? That it's one big lookup table with a particular form of key nesting is... actually a pretty good model.


That's exactly what DB indexes are though ;)

Exactly right. The others here are confusing LLM with "chat bot", and also they seem to be confusing token prediction with LLM. I have a feeling the mainstream won't really get it until a ChatGPT clone is online and ready to use lol and still it will be "This isn't a true chat bot, this is actually a Markovian language interface!"

I’ve had good results in the past by fine-tuning YOLO for image classification and object detection within images; you can find a rough guide here on how to create a good training dataset and such [0] [1]

YMMV though, ultimately accuracy is going to depend on the quality of the labelled data and your use case

There might be models better suited to your specific needs too, but ultimately you’re always going to need the training dataset – see the rough sketch after the links below

[0]: https://labelstud.io/blog/getting-started-with-image-classif...

[1]: https://docs.ultralytics.com/tasks/classify/
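
For reference, the fine-tuning itself is only a few lines with the ultralytics package – this is a rough sketch, with placeholder paths and hyperparameters (see [1] for the actual options):

    from ultralytics import YOLO  # pip install ultralytics

    # Start from a pretrained classification checkpoint and fine-tune it.
    model = YOLO("yolov8n-cls.pt")

    # Expects an image folder with train/ and val/ subdirectories,
    # one subfolder per class (the path here is a placeholder).
    model.train(data="path/to/dataset", epochs=20, imgsz=224)

    results = model("path/to/test_image.jpg")  # run inference on one image
    print(results[0].probs.top1)               # index of the top predicted class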


… thanks GPT

Lol yes, I too can ask AI how to AI.

Took me some time to figure out how to run it, but the layout recogniser model hosted on huggingface is pretty good!

It correctly identifies tables that even paid models like the AWS Textract Document Analysis API fail on – for instance, tables with one column, which often confuse AWS even if they have a clear header and are labelled "Table" in the text.

I would, however, love to know broadly what kind of documents it was trained on, as my results could be pure luck – hard to say without a proper benchmark

Very nice layout recognition, although I can't quite comment on the RAG performance itself – I think some of the architecture decisions are odd. It mixes a bunch of different PDF parsers, for example, which will all produce different quality, and it's not clear to me which one it defaults to, as it seems to differ in different places in the code (the simple parser defaults to PyPDF2, which is not a great option)


What's the name of the layout recogniser model? I did not have a good experience extracting layout from tables, especially those without column boundaries (spaces instead of lines to demarcate boundaries)


it's https://huggingface.co/InfiniFlow/deepdoc and the code for usage is in https://github.com/infiniflow/ragflow/blob/main/deepdoc/READ... – it took me a bit of trial and error to get it working

It seems to be a YOLOv8 fine-tune; I only did a couple of tests, but the results were decent. Another model that is supposed to be fine-tuned for borderless tables is https://huggingface.co/keremberke/yolov8m-table-extraction – I haven't had great results with it myself, but it may be worth a try for you.


Thank you very much!


Here's a quick test to run: if you have Windows and MS Office, File->Open your PDF and report the results. You might be surprised at the layout extraction quality.


This is because PDF has so many different versions. A third-party tool like pdfplumber won't fit them all. For example, using pdfplumber to parse some PDFs will cause the system to raise exceptions, while sometimes fitz works in situations that pdfplumber can't handle well. It looks a bit complicated, but RAGFlow uses multiple parsing tools to handle different types of PDFs.
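
The fallback idea looks roughly like this (an illustrative sketch, not RAGFlow's actual code):

    import fitz  # PyMuPDF
    import pdfplumber

    # Try pdfplumber first; if it raises on an odd PDF, fall back to fitz.
    def extract_text(path):
        try:
            with pdfplumber.open(path) as pdf:
                return "\n".join(page.extract_text() or "" for page in pdf.pages)
        except Exception:
            with fitz.open(path) as doc:
                return "\n".join(page.get_text() for page in doc)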


I tend to use inner functions (nested def) when I need something more than a simple lambda, but don't want to come up with a good name or pollute the class / module namespace

It works okay, but I do wish there was a better way
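
For example, a toy case where the helper has a bit too much logic for a comfortable lambda:

    def top_scores(records, limit=3):
        # Single-use helper as a nested def: more logic than fits in a
        # lambda, and it stays out of the module namespace.
        def sort_key(record):
            name, score = record
            return (-score, name.lower())

        return sorted(records, key=sort_key)[:limit]

    print(top_scores([("bob", 7), ("Alice", 9), ("carol", 9)]))
    # [('Alice', 9), ('carol', 9), ('bob', 7)]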


> It works okay, but I do wish there was a better way

What's wrong with it?


nothing really, probably just me, but it always felt a bit ugly to nest function definitions; plus it can separate the (often single-use) function from the place where it's used


It's other visitors online at the same time: your mouse pointer is sent via websockets every few seconds and received by all other online visitors. I agree it's very distracting


Dark Souls of websites


Does it export to PowerPoint? I have the same issue: my client wants weekly data-driven decks, and I've automated the data part, but I offload the presentation layer to other people, who often make mistakes copy/pasting or with the charts

However, my client still wants a .pptx that looks great because they then present it internally, and while the current workflow isn’t great it’s very cheap and effortless (for me, at least)


I use Quarto to automate slide creation and output to PowerPoint. You can even use a premade PowerPoint template so that the output slides are pretty much done.


That tool looks really cool! We hadn't come across that one before, but we did look into a lot of embedding solutions back when we were making these kinds of presentations regularly.

I think with Thorntale we're actually trying to take a slightly different route to a solution; being the actual presentation tool is going to unlock a lot of interactivity features that exporting to PowerPoint probably can't ever match. Hopefully we'll be able to show some of those off after a few more months of dev!


Nice, I didn't know about it – I took a quick look at the templating options and it seems like it might do the job. Thanks!


Thanks for the recommendation, Quarto might be just what I've been looking for.


We do plan to add export as .pptx, .pdf, and to google slides.

