Thinking that using an API can guarantee which model you're getting kills your credibility. It's an entirely black box: if OpenAI wants to lie to you, they can.
The Code Spell Checker extension is great. It handles camelCase properly, and adding words to the dictionary is fast (Cmd+.). It catches many typos while coding.
Probably not the best last line of defence for public articles, but likely good enough.
What a comment. Why do it the easy way when the more difficult and slower way works out to the same result‽ For people who just want to USE models and not hack at them, TheBloke is exactly the right place to go.
Like telling someone interested in 3D printing minis to build a 3D printer instead of buying one. Obviously that helps them get to their goal of printing minis faster, right?
Actually, consider that the commenter may have helped un-obfuscate this world a little bit by saying that it is, in fact, easy. To be honest, the hardest part about the local LLM scene is the absurd amount of jargon introduced - everything looks a bit more complex than it is. It really is easy with llama.cpp; someone even wrote a tutorial here: https://github.com/ggerganov/llama.cpp/discussions/2948 .
But yes, TheBloke tends to have conversions up very quickly as well and has made a name for himself for doing this (+more).
Sure. Suppose that we have a trivial key-value table mapping integer keys to arbitrary jsonb values:
example=> CREATE TABLE tab(k int PRIMARY KEY, data jsonb NOT NULL);
CREATE TABLE
We can fill this with heterogeneous values:
example=> INSERT INTO tab(k, data) SELECT i, format('{"mod":%s, "v%s":true}', i % 1000, i)::jsonb FROM generate_series(1,10000) q(i);
INSERT 0 10000
example=> INSERT INTO tab(k, data) SELECT i, '{"different":"abc"}'::jsonb FROM generate_series(10001,20000) q(i);
INSERT 0 10000
Now, keys in the range 1–10000 correspond to values with a JSON key "mod". We can create an index on that property of the JSON object:
example=> CREATE INDEX idx ON tab((data->'mod'));
CREATE INDEX
And we can check that the query uses the index and only ever reads 10 rows:
example=> EXPLAIN ANALYZE SELECT k, data FROM tab WHERE data->'mod' = '7';
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on tab  (cost=5.06..157.71 rows=100 width=40) (actual time=0.035..0.052 rows=10 loops=1)
   Recheck Cond: ((data -> 'mod'::text) = '7'::jsonb)
   Heap Blocks: exact=10
   ->  Bitmap Index Scan on idx  (cost=0.00..5.04 rows=100 width=0) (actual time=0.026..0.027 rows=10 loops=1)
         Index Cond: ((data -> 'mod'::text) = '7'::jsonb)
 Planning Time: 0.086 ms
 Execution Time: 0.078 ms
If we did not have an index, the query would be slower:
example=> DROP INDEX idx;
DROP INDEX
example=> EXPLAIN ANALYZE SELECT k, data FROM tab WHERE data->'mod' = '7';
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 Seq Scan on tab  (cost=0.00..467.00 rows=100 width=34) (actual time=0.019..9.968 rows=10 loops=1)
   Filter: ((data -> 'mod'::text) = '7'::jsonb)
   Rows Removed by Filter: 19990
 Planning Time: 0.157 ms
 Execution Time: 9.989 ms
Hence, "arbitrary indices on derived functions of your JSONB data". So the query is fast, and there's no problem with the JSON shapes of `data` being different for different rows.
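To make the "derived functions" part concrete, the index expression can also be a genuine computation over the JSONB rather than a plain key extraction. A sketch against the same table (the index name idx_mod10 and the modulo-10 grouping are invented for illustration, and it assumes every "mod" value is numeric, since a non-numeric value would make the cast fail at index build time):

example=> CREATE INDEX idx_mod10 ON tab (((data->>'mod')::int % 10));
CREATE INDEX
example=> EXPLAIN ANALYZE SELECT k, data FROM tab WHERE (data->>'mod')::int % 10 = 7;

Rows without a "mod" key simply index as NULL, and a WHERE clause that matches the indexed expression exactly should again show a Bitmap Index Scan rather than a Seq Scan, though with roughly 1,000 matching rows here the planner may still prefer a sequential scan.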
> This statement doesn't mean anything useful.
That might be true if it weren't a well-known phrase that carries decades of examples behind it.
> It also describes every social safety net.
Social safety nets are not spending millions to billions of dollars on disasters, especially disasters we've been predicting since the dawn of industry. Social safety nets have upper bounds. Being able to handle these disasters is a benefit of society, yes. Social safety nets help people get on their feet, not prevent entire towns from being wiped out. It's disingenuous to imply America's social safety net (which it doesn't have) has anything to do with this.
When individuals create liability this large, they are jailed. When corporations do it, we pretend we never could have expected this outcome.
> That might be true if that wasn't a well known phrase that carries decades of examples behind it.
It is true, and there are decades of examples. It’s just meaningless because it matches every social safety net, so you can use it to argue against anything you want.
> Social safety nets are not spending millions to billions of dollars on disasters, especially disasters we've been predicting since the dawn of industry.
They are. Look at the military, which is another example of everyone paying in but small groups receiving outsized benefits on many different fronts.
> Social safety nets help people get on their feet, not prevent entire towns from being wiped out.
This is what FEMA is for.
> It's disingenuous to imply America's social safety net (which it doesn't have) has anything to do with this.
You can’t even form a coherent sentence here. You can’t, on the one hand, defend the safety net from a comparison and simultaneously claim it doesn’t exist. If it didn’t exist, you wouldn’t reply.
> When individuals create liability this large, they are jailed.
No, they aren’t. People in positions of power in government do this all the time and receive absolutely no punishment. One regulation change can decimate people’s homes, jobs, paths to citizenship, etc.
> When corporations do it, we pretend we never could have expected this outcome.
No, we don’t. It’s the reason many corporate insurance programs exist (Superfund, FDIC, etc.).
A free trial with automatic billing at the end is a dark pattern. The non-dark-pattern approach would be to get the customer's approval before billing once the free trial ends.
Good to know that the prevailing commercial tech culture now sees plagiarism and stealing ideas without attribution as the modern way of doing business, and hopes that dressing things up under some algorithmic veil will hide the act.
I guess the pit of moral decline has no bottom. The consolation is that theft has never been the road to wealth. Once the plundering is over, the only thing left is a wasteland.
It seems that Microsoft has finally found a way to kill the open source "cancer".
As they say, people are unwilling to understand something if their monetary gain depends on not understanding it.
Let me break it down for you. If I ask for a visualization that squares the circle and there is one repo that has an example of squaring the circle, the LLM will "arrive" at a way of squaring the circle.
If (1) an LLM is able to arrive at solutions in the same class of difficulty as the solution for the target problem and (2) it's not possible to establish the provenance of the solution actually offered by the LLM, then what's the argument for assuming that the solution is based on IP rather than constructive reasoning?
Anecdotally, ChatGPT seems much worse to me than Google for getting correct answers. Like, orders of magnitude worse. "Tells me the wrong timezone for a city" kind of bad. No doubt it will be much better in the future, and they've definitely found product-market fit with the interface, but I would not trust it right now with anything even slightly important to me.
But how can you verify ChatGPT's answers if you don't know what its sources were? E.g. if I google a technical question about HTML5, I can see whether a result is the HTML5 spec, MDN, W3Schools, or a random medium blog. If I google a medical question, I can see if I'm on a hospital's website or on Men's Health.
The parent is using ChatGPT as a tutor though, not as a Google search.
I’d expect a tutor to give the right answer, but I wouldn’t expect Google to. ChatGPT is often wrong, which is a problem if you’re trying to learn something and using it as your tutor/source of truth.
I think the friction here exists outside of Mistral's control.