One of my favorite applications of multimodal LLMs thus far is the ability to:
1. Draw a DAG of whatever pipeline I’m working on with pen and paper.
2. Take a photo of the graph, mistakes and all.
3. Ask ChatGPT to translate the image into mermaid.js.
Given how complicated the pipelines I'm working with are, and how sloppy the hand-drawn images tend to be, it's truly amazing how well this workflow works.
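For anyone who hasn't tried this: what comes back is plain mermaid source you can paste straight into a renderer. A minimal sketch of the kind of output I mean (stage names invented for illustration):

    flowchart TD
        raw[Raw data] --> clean[Cleaning]
        clean --> features[Feature extraction]
        features --> train[Model training]
        features --> report[QA report]
        train --> deploy[Deployment]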
I recently did a variation of this where, instead of drawing, I just drafted a quick few bullet points and text describing at a high level what the system should do. And then I asked ChatGPT to identify use cases and generate sequence diagrams for each use case in PUML format (PlantUML). Shockingly effective, and it took about five minutes. This was a technical proposal that I shared with a few partner companies to provide a detailed plan to a customer. It came after several online meetings, spaced over a few weeks, of us negotiating the details. Pretty important document, and it was well received. PlantUML looks decent enough that you can get away with sticking the resulting diagrams in a document.
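For a sense of scale, each generated use case is only a handful of lines of PlantUML source. A rough sketch of the shape of one (the participants here are invented, not from the actual proposal):

    @startuml
    actor Customer
    participant "Partner Portal" as Portal
    participant "Our Backend" as Backend
    Customer -> Portal : submit request
    Portal -> Backend : validate and forward
    Backend --> Portal : confirmation
    Portal --> Customer : request accepted
    @enduml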
I'm a busy person. I don't have hours of time that I can take out of my schedule to generate what I regard as write-only documentation (nobody will ever read or truly value it), the kind that ticks the box of "we have stuff to point at when somebody asks" (which nobody ever will). That has lowish value. Sometimes it's nice to have, though. The above is a fine example: people will glance at it, give me a little thumbs up, and then give me permission to proceed as planned and bill accordingly. Job done. It's not a reference design that anyone will ever look at for more than a few seconds.
After a few decades in the industry, I'm extremely skeptical of the value of diagrams vs. the time required to produce them. I just don't see it. A lot of good software gets produced without them. You don't need blueprints for your blueprints, and source code already is a blueprint (one that compiles automatically into working software). People value traits like structure, readability, and conciseness in source code for a reason: it allows them to treat source code as design assets. I don't write UML; I stub out data classes and interfaces instead, and then I refactor them over and over again. Diagrams just slow me down.
But a few minutes is about the threshold at which it's worth spending brain cycles on producing them to enrich documentation that I'm writing anyway in text form. Quickly jot down some notes, don't waste any time whatsoever obsessing over the awkward syntax of these micro languages, and just get the essentials nailed. I bet I can get it down to a minute or so with better LLMs and larger context windows. "Examine this project, produce an overview diagram of all the database tables". That's a prompt I'd write. In the same way, letting LLMs document code is a great use of time.
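If that prompt works out, the answer could be as little as a few lines of mermaid, something along these lines (table names invented for the sake of example):

    erDiagram
        USERS ||--o{ ORDERS : places
        ORDERS ||--|{ ORDER_ITEMS : contains
        PRODUCTS ||--o{ ORDER_ITEMS : "appears in"

Render it, glance at it, paste it into the doc, move on.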
> write-only documentation (nobody will ever read or truly value it)
But what's the point of producing such documentation? I can imagine that the process of creating it might be beneficial in itself (committing things to memory, finding discrepancies, etc.). If it's not, why can't it just be skipped?
Documentation is a tool for creating shared understanding. If you don’t need to share your understanding, don’t write docs.
Note however that sharing understanding works on the people axis and on the time axis. Docs allow you to share your current understanding with your future self. They’d better be general enough to be true then, though.
Nowadays I find Gemini Pro able to accurately document a complex workflow within minutes just by looking at the sources, and sometimes even just the logs, so the value of low-level docs is questionable. High-level requirements - essentially how it's supposed to work and what for - are very valuable, though, as they allow you and the model to cross-check whether things work as they were intended.
None, other than ticking boxes and shutting up the people who keep asking for such things to be produced, and who then invariably don't have the attention span to do anything with the diagram. That's literally the only reason I have for creating them. Otherwise it's a tedious activity that gets in the way of developing, slows me down, and just interrupts my creative process. I usually have better things to do.
And as you might understand from what I just said, I rarely produce any diagrams. I've been active as a developer since before UML got popular, then peaked, then faded into obscurity. I still have a signed (by Martin Fowler) copy of UML Distilled on a shelf somewhere gathering dust. First edition and everything. I don't think it's very valuable. Waste paper, basically. But contact me if you feel otherwise. It's in pristine condition because I never did much more than thumb through it and shelve it.
25 years ago, any self-respecting architect had expensive licenses for things like Rational Rose or Visio. And they'd be fiddling with those tools for hours to produce detailed class and other diagrams. And those diagrams were as useless then as they are now. Epic waste of time. People stopped buying and using those tools. This was once a very big industry that has now imploded to next to nothing. Hardly anybody is buying; very few people waste budget on this crap anymore. It's a niche market with some niche revenue. Tens of millions of developers ignore these tools.
What do PlantUML, mermaid, and other OSS diagramming tools have in common? The people who make them don't eat their own dogfood to document how their own software works. You can have some fun looking for diagrams in OSS projects. With few exceptions, this is not a thing (devops people seem to have a weird obsession with diagramming. And overengineering). I'm not aware of many serious OSS projects where developers have bothered to document even a tiny fraction of their software with diagrams. Including all the major OSS UML diagramming tools.
The documentation for these tools contains plenty of examples, of course (typically very simplistic). Just not any that document how the tool itself is designed or works. I'm not judging; I wouldn't bother either, for the reasons I articulated above. But I find it ironic that even diagram tool developers don't seem to feel an urge to use diagrams for their own stuff. Makes you wonder why they bother creating the tool. You'd have to be passionate about diagramming tools, but not so much that you'd want to use them for your own software.
Yep, I'm in the rare disease space. "Impossible" is pretty appropriate.
It's tricky. On the one hand, it's obviously not appropriate to be flippant about patient privacy. On the other, it's clear that advancements in human health are being hindered by our current approach to (dis)allowing researchers access to data.
For me it's a situation of "once bitten, twice shy". What are the odds the medical data intended for research will be handled correctly and not used outside of its intended purpose?
What are the potential downsides to misuse of health data? Genuinely asking - I'm not sure what someone malicious would do with my health records, especially if it's anonymized.
Insurance companies refusing to pay because of $reason based on deanonymized data. Ad companies or Big Pharma bombarding you with pitches for new pills because they want to sell more pills. Blackmail because you have an embarrassing disease.
There's a lot of money being spent on deanonymizing data, and I would never count on it ever being able to remain anonymous with that much incentive.
There are several examples of anonymized data turning out not to be so anonymized after all, and being traceable back to the person. As for what someone malicious could do with health records: you have a contingent of people in some states hunting down women for having abortions, so that might be something you don't want getting out there. Or you might be someone in a very religious area who doesn't want people finding out you're getting AIDS treatment.
I understand the theoretical concerns in these cases, but IMO it does not weigh heavily against the (conservatively) hundreds of thousands of annual deaths due to hindered medical research.
It's hard to overstate how impossible it is to do even basic research across institutional health datasets, even if you're a giant organization with a compliance team. It's soul-draining, and frankly it's the reason a lot of smart people jump ship and work in finance or crypto or whatever, where you can accomplish something even if it's goofy.
You're not addressing the root concern which is that healthcare is notoriously insecure. Approaching this as "who cares if things get leaked" instead of improving security of records is why getting data is impossible.
Whatever my or your biggest concern is, it's irrelevant. Patient data security is why you can't get the data you say you need. That is just a fact, and I would think energy is better spent on improving the handling of patient data if you want easier access to that data for research purposes.
Those are both theoretical examples of what people might want to do with re-identified medical data. They are not demonstrated harms of things that have happened in real life.
Develop tools for health insurance companies to abuse patients. Instead of denying coverage to patients based on real life symptoms, they can deny coverage due to model outputs that are “based” on real life data.
Since these models are black boxes, it's easy to hide biases within them.
Or worse: people with conditions similar to yours have been shown to develop other problems, so we're going to charge you now for what we think you might develop later.
It's the same negative attached to pre-crime in policing: because people who wear the same clothes, drive the same car, listen to the same music, and share other sames have committed crimes, we think you will too. Someday.
In the USA, health plans aren't allowed to deny coverage to patients based on genetics or pre-existing conditions. They aren't stupid enough to try to break those laws. Employees can't keep a secret. And most of the claims costs are directly passed on to employers (group buyers) anyway, so the major health insurance companies have little direct incentive to deny coverage; with minimum limits on the medical loss ratio it's rather the opposite.
Companies can just not hire you for "culture fit" or something else, based on leaked data about your health problems, in order to keep their premiums low or just to avoid hiring certain types of people (fill in your blank here).
And there's the problem. In theory, this is possible. But in reality, there is no such thing.
You also assume that your data will be correct. If data integrity were so easy and common, people wouldn't be encouraged to repeatedly check their credit reports for mistakes.
One bit flip and you go from "perfectly healthy" to "about to die" and suddenly you can't get life insurance, your credit score tanks and you can't get a job.
The downsides are there, of course, and other users have already laid out the theoretical risks. Unfortunately the discussion only ever centers on the downsides, with fear-mongering aplenty, rather than treating the situation the same as any other situation in life: as a risk-benefit trade-off.
"Your OneMedical account by Meta-Amazon LLC has been deactivated due to suspicious activity based on analysis of your genome and online browsing habits. Please proceed to the nearest fresh location for mandatory euthanasia."
If a researcher can get the data, then so could someone else with less altruistic motives. So the good actor is slowed because of the bad actor. Unfortunately, there's very little way to prove the good is good and not crossing their fingers behind their back.
I want to second this. It seems like document chunking is the most difficult part of the pipeline at this point.
You gave the example of unstructured PDFs, but there are challenges with structured docs as well. We’ve run into docs that are hard to chunk because of deeply nested and repeated structure. For example, there might be a long experimental protocol with multiple steps; at the end of each step, there’s a “Debugging” table for troubleshooting anything that might have gone wrong in that step. The debugging table is a natural chunk, except that once chunked there are a dozen such tables that are semantically similar when decoupled from their original context and position in the tree structure of the document.
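To make the shape of the problem concrete, the tree looks roughly like this (sketched in mermaid; the step names are generic placeholders):

    flowchart TD
        P[Protocol] --> S1[Step 1]
        P --> S2[Step 2]
        P --> SN[Step N]
        S1 --> D1[Debugging table]
        S2 --> D2[Debugging table]
        SN --> DN[Debugging table]

Each Debugging table is only distinguishable via its parent step, and a flat chunker throws away exactly that edge.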
This is one example, but there are many other cases where key context for a chunk is nearby in a structured sense, but far away in the flattened document, and therefore completely lost when chunking.
Is this an example that could benefit from something like knowledge graph construction or structured entity extraction?
I'm just curious because we have theorized and seen in practice that extraction is a way to answer questions which require connected information across disparate chunks, like you can see in the simple cookbook here [https://r2r-docs.sciphi.ai/cookbooks/knowledge-graph].
Or do you think this is something that can just be solved with more advanced multimodal ingestion?
I think an LLM could be successful if it wasn't just textually aware, but also spatially aware. Like, we know these things just chew through forum posts like this one. Knowing where the username is, where the body of text is, the submit button, etc., might be foundational in actual problem in, problem out.
Just to add to the list of things Jim Simons did and funded: he also established the Simons Foundation Autism Research Initiative (SFARI).
"SFARI’s mission is to improve the understanding, diagnosis and treatment of autism spectrum disorders by funding innovative research of the highest quality and relevance."
SFARI in turn funds a lot of foundational neurological and rare disease research, since autism is such a common phenotype.
The paper kinda leaves you hanging on the "alternatives" front, even though they have a section dedicated to it.
In addition to the _quality_ of any proposed alternative(s), computational speed also has to be a consideration. I've run into multiple situations where you want to measure similarities on the order of millions/billions of times. Especially for realtime applications (like RAG?), speed may even outweigh quality.
Hey, HN! Maybe not your typical startup announcement here, but I recently left my job as a bioinformatics engineer to start a company called Lodestar Bio.
We are addressing challenges faced by families of children with rare diseases who are seeking a diagnosis, and our solution is a two-sided marketplace for rare disease genomic insights.
On one side, we will offer children who have a rare disease—and an inconclusive whole genome assay—another chance at a diagnosis. A majority of families who order a whole genome test do not receive their much-needed diagnosis and are rarely provided with clear follow-up options. On the other side of the market, we will use the genomic data we collect to identify orphan drug leads, which we will sell to biopharma clients who are creating personalized medicines.
I'm happy to chat about any questions or comments you have!