(I don't have a background (or interest) in quantum computing)
I've tried some of the "chat with pdf" websites for other topics.
It seems like they could help you understand the paper.
If you give it a try, could you report back here on how it worked for you?
I am curious. Why not give each app a private copy of common user resources? Every app has access to contacts, but by default only the ones it creates. Then Android could allow sharing across apps based on what the user wants to share. Sharing would be a little tedious, but an OS-provided sharing tool could reduce that friction.
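To make the proposal concrete, here is a toy sketch of the idea in Python: each app gets its own private contact store, and a single OS-level call copies an entry between stores only with user consent. All of the names here (`ContactStore`, `share`, etc.) are hypothetical illustrations, not real Android APIs.

```python
# Toy model: per-app private contact stores plus an OS-mediated share call.
# Nothing here corresponds to actual Android APIs; it only illustrates the idea.

class ContactStore:
    """A contact store private to a single app."""
    def __init__(self):
        self._contacts = {}          # contact id -> record

    def create(self, cid, record):
        self._contacts[cid] = record

    def get(self, cid):
        return self._contacts.get(cid)

class OSBroker:
    """The OS owns all stores and mediates sharing between them."""
    def __init__(self):
        self._stores = {}            # app name -> its private ContactStore

    def store_for(self, app):
        return self._stores.setdefault(app, ContactStore())

    def share(self, src_app, dst_app, cid, user_approved):
        # The one place friction appears: sharing requires explicit consent.
        if not user_approved:
            raise PermissionError("user declined the share")
        record = self.store_for(src_app).get(cid)
        self.store_for(dst_app).create(cid, record)

broker = OSBroker()
broker.store_for("mail").create("alice", {"email": "alice@example.com"})

# The chat app sees nothing until the user approves a share.
broker.share("mail", "chat", "alice", user_approved=True)
print(broker.store_for("chat").get("alice"))
```

The point of the sketch is that the default is isolation; the OS-provided `share` call is the only bridge, which is where a system sharing UI would live.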
(slightly off-topic) The elephant in the room is ... Don't we need to address the issue that OpenAI uses the data to train their future models? For many inputs this is not a big problem at all. But shouldn't there be an option (even if it requires an additional cost) that lets users enter a mode where their input is not retained?
By default, OpenAI doesn’t train on data from the API. They do with ChatGPT, and you cannot disable it. That is why there are so many front ends for the API now.
(my opinion) It is not predicting based on 'words/tokens'. It transforms the general word/token embeddings into context-specific embeddings that encode "meaning". It is not an n-gram model of words; it is more like an n-gram model of "meaning". It doesn't encode all the "meanings" that humans are able to, but with additional labelled data it should get closer. I think GPT is a component which can be combined with others to create AGI. Adding the API so it can use tools, and allowing it to self-reflect, seem like they will get it closer to AGI quickly. I think allowing it to read/write state will make it conscious. Creating the additional labels it needs will take time, but it can do that on its own (similar to AlphaGo self-play).
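The "context-specific embedding" point can be caricatured in a few lines: a static model gives "bank" one vector forever, while a contextual model mixes in the surrounding words so the same token lands in different places depending on the sentence. Real transformers do this with attention; in this sketch a plain average of neighbor vectors stands in for it, and the tiny 3-d vectors are made up for illustration.

```python
# Caricature of static vs contextual embeddings. The vectors are invented;
# averaging neighbors stands in for attention. Not a real model.
import math

STATIC = {                      # hypothetical 3-d word vectors
    "bank":    [0.5, 0.5, 0.0],
    "river":   [0.0, 1.0, 0.0],
    "money":   [1.0, 0.0, 0.0],
    "deposit": [0.9, 0.1, 0.0],
    "fishing": [0.0, 0.9, 0.1],
}

def contextualize(word, sentence):
    """Blend a word's static vector with the mean of its context words."""
    ctx = [STATIC[w] for w in sentence if w != word and w in STATIC]
    mean = [sum(v[i] for v in ctx) / len(ctx) for i in range(3)]
    return [(a + b) / 2 for a, b in zip(STATIC[word], mean)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Same surface token, two different contexts, two different vectors:
bank_river = contextualize("bank", ["river", "fishing", "bank"])
bank_money = contextualize("bank", ["money", "deposit", "bank"])

print(cosine(bank_river, STATIC["river"]))   # pulled toward "river"
print(cosine(bank_money, STATIC["money"]))   # pulled toward "money"
```

That shift — the same token moving around the space depending on context — is the sense in which the model works over "meaning" rather than raw tokens.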
You are absolutely right; that's the more in-depth explanation of why it's not just an overly complicated Markov chain.
At the same time, "meaning" here is essentially "close together in a big high-dimensional space". It's meaning in the same way YouTube recommendations are conceptually related by probability.
And yet, the output is nothing short of incredible for something so blunt in how it functions, much like our brains, I suppose.
I'm a die-hard classical-AI fan, though. I like knowing the rules, knowing the results are provably optimal, and knowing that if I ask for a different result I can actually get a truly, meaningfully different output. Not nearly as convenient as a chatbot, of course, and unfortunately ChatGPT is abysmal at generating constraint problems. Maybe one day we'll get the best of both worlds.
The main issue I have with DIS is that creating the labels for my own dataset is super expensive. (I think it might be easier to generate the training data using Stable Diffusion than with human labelling.)
It is related to subpixel labelling: when a line/curve in the foreground is thinner than a pixel, you end up having to edit the mask one pixel at a time. The authors of DIS are working on a new dataset and model which should work for my use case.
BTW, I used DIS to create labels for a batch of 20 images, manually corrected them, and used them to fine-tune a new model. That worked well, but it still took me several hours to edit the labels.
I tried using Stable Diffusion-generated labels several weeks ago, but with ControlNet and other advances I think I should try again.
(My dataset is about 100k images. I probably only need to label about 10k of them to fine-tune DIS.)
Selfish behavior by corporations is borderline mandatory.
Imagine dealing with customer service when any human you reach is using an LLM to generate 50x more responses than they currently can, with no more ability to improve the situation. They will all 'sound' helpful and human, but offer zero possibility of actual help. 'Jailbreaking' to get human attention will remain important.
Now imagine what is developed for corporations supplied to government bureaucracies, investigative agencies, and the military.
I think this application of GPT is far more useful than the chat interface.
(Add an option for users to pay sooner).
Here are a few suggestions:
1) Allow it to take a search term, do a web search, and let the user select from those results.
2) Allow it to look at more than one document.
3) Detect if the output contains math formulas/graphs and render them
(or let me write a JavaScript post-processor so I can add that logic myself).
4) When a user's question can't be answered, prompt the user to let your system do a web search and then include those documents.
5) Create a version that can run locally for those of us with private data. You should charge a lot for that version (~$100k+ if the customer provides the hardware, $1M+ if you have to provide the hardware as a black box).
6) Detect research papers and read their citations. You may have to ask the user for an SSO key to fetch citations from paywalled sites.
7) Make abstract responses more concrete. See if you can train the model to provide an example, or describe the purpose or intuition, when it responds.
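Suggestion 3 could start as something very simple: scan the response for the common TeX math delimiters and, if any are found, tell the front end to run a renderer (KaTeX or MathJax, say) over it. A minimal sketch, assuming nothing about how the site is actually built; the pattern and function name are my own:

```python
# Heuristic detector for LaTeX-style math in a model response.
# Checks the common TeX delimiters only; this is a flag for a renderer,
# not a parser, and will miss bare formulas written without delimiters.
import re

MATH_PATTERN = re.compile(
    r"\$\$.+?\$\$"        # display math: $$ ... $$
    r"|\$[^$\n]+\$"       # inline math:  $ ... $
    r"|\\\[.+?\\\]"       # display math: \[ ... \]
    r"|\\\(.+?\\\)",      # inline math:  \( ... \)
    re.DOTALL,
)

def contains_math(text):
    """Return True if the text appears to contain delimited TeX math."""
    return bool(MATH_PATTERN.search(text))

print(contains_math("The energy is $E = mc^2$ in this frame."))
print(contains_math("Display form: \\[ \\int_0^1 x\\,dx \\]"))
print(contains_math("No formulas here, just prose."))
```

A JavaScript hook like the one suggested in point 3 could do exactly this on the client side instead, which is why exposing a post-processing hook would cover both cases.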