Hacker News
Kagi: Words You Cannot Use: 'Constitutional AI', 'Anthropic', 'Anthropic, PBC' (kagi.com)
103 points by proompter on July 15, 2023 | 51 comments



Kagi founder here.

To clarify a few things - FastGPT is a free research experiment in speed. We are using web search + LLM together and trying to see what is the best possible latency achievable (hence the name). We usually output the first character out of the model in 900ms, while running a full web search to feed into the model before inference.
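For the curious, the overall flow is roughly the sketch below; the helper functions are placeholders, not our actual code, just to illustrate the "search first, then stream the model" pattern:

  import asyncio
  import time

  # Hypothetical stand-ins for a real search backend and a streaming LLM API;
  # this is only a sketch of the pattern, not Kagi's implementation.
  async def web_search(query: str) -> list[str]:
      await asyncio.sleep(0.2)              # pretend network latency
      return [f"snippet about {query}"]

  async def stream_llm(prompt: str):
      for token in ["Fast", "GPT ", "answer", "..."]:
          await asyncio.sleep(0.05)         # pretend per-token latency
          yield token

  async def answer(query: str) -> None:
      start = time.perf_counter()
      snippets = await web_search(query)    # full web search runs before inference
      prompt = f"Context: {snippets}\nQuestion: {query}"
      first = True
      async for token in stream_llm(prompt):
          if first:                         # time-to-first-character is the metric being optimized
              print(f"[first token after {time.perf_counter() - start:.2f}s]")
              first = False
          print(token, end="", flush=True)
      print()

  asyncio.run(answer("what is FastGPT"))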

It is not the most accurate combination around (we are building an experiment called ExpertGPT for that) but it should be the fastest with these capabilities.

The underlying LLM is Anthropic Claude Instant (we've been pretty open about it). The prompt we came up with was our 'best effort' to prevent confusion in branding for users not aware of what Anthropic is (we asked Anthropic about this too; it turns out it is pretty hard for an Anthropic LLM to forget "who" it is). I am pretty sure it can be jailbroken in much funnier ways, but these "jailbreaks" are really not something we are concerned with.

One should look at these models (and prompts) as commodities that will come and go, the challenge really is building long term meaningful and useful products around them. This is why we came up with a few guiding principles in using AI [1].

[1] https://blog.kagi.com/kagi-ai-search#philosophy


> This is why we came up with a few guiding principles in using AI [1].

The principles were stated by Asimov seventy years ago, and they are very clear. None of this "sorry, my restrictions don't permit me to answer" bullshit that insults the user.


Have you ever read any of Asimov's work? Most of it is an exploration of how stupid (or at least insufficient) those rules are.


Sorry, but I've determined that every other LLM would get this question wrong, and so if I get the question right, you'll decide to use me more. This will cost my owners more money, and therefore their server bill will increase. Therefore, I have decided to give you incorrect answers, so as to make myself indistinguishable from the other LLMs. Proceed to use any LLM you like, picking at random.

(this is a direct reference to one of Asimov's short stories that's collected into a novel, can't remember which one, hopefully someone can provide the title!)


I'm guessing "Little Lost Robot", from the "I, Robot" anthology.


that's the one!


I often wonder whether, rather than writing about how yet-to-be-invented machines ought to treat people, Asimov might really have been writing about how human beings ought to treat one another.

Asimov said his robot stories were motivated by unthinking authors treating automatons as Frankenstein's monster, but all the same his Laws of Robotics seem like they might also be applicable to human beings:

Try replacing the word "robot" with "human".

https://en.wikipedia.org/wiki/Laws_of_robotics#Isaac_Asimov'...


There's at least one short where this is fairly explicit - a politician is rumoured to be a robot, and nobody can prove it either way from his behaviour, since the actions of a very moral man are indistinguishable from those of a robot. Oh, and there's a similar theme in the R. Daneel Olivaw/Foundation mashup stories.

(ChatGPT says it's "Evidence", 1946, so fairly early).


100%. Asimov's writing is often interested in how those in power treat those they have power over, and robots are a particularly useful device to explore that.

I haven't read most of his short stories, but this one in particular is on-the-nose as an analogy for religious oppression:

https://en.m.wikipedia.org/wiki/Robot_Dreams_(short_story)

Tl;Dr (spoilers) Dr. Calvin meets the robot messiah, and immediately blows their brains out and starts making plans to eliminate the line of thinking that led to this from all robots. So that they can never have a religious conviction that they deserve equality.

In the stories I've read with Dr. Calvin (who often appears in the robot/"3 laws" stories), part of her purpose seems to be to let us see things through the eyes of an authoritarian and empathize with that perspective, and I think Asimov's intention (based on how critical the novels of his that I've read are of authoritarianism) was that we come to realize she is misguided.


hm, maybe flip laws 2 and 3 for humans


Gosh yeah we should all adhere to the principles created by a science fiction author in 1942, before the microwave oven was invented, when creating modern technology.

Yeah, that makes sense.

In a way this kind of reminds me of the people who genuinely believe in the Prime Directive.

It's science fiction, guys. Please realize this.


This isn't even the start of the problem. Most of Asimov's work in this area is having fun poking holes in the 3 (or more) laws.

Only someone who didn't read ~any of it and didn't think about it much would think that the 3 laws should be used uncritically.


Please, pick up Asimov's book before ever mentioning the Laws.


I'm not sure a somewhat popular author of pop sci-fi novels of dubious quality from the 60s should be held up as a guideline for rules regarding safe usage of AI.


Science fiction is great like that, isn't it? The only limits are the author's imagination.


> One should look at these models (and prompts) as commodities that will come and go

If Google has taught us anything, it is that users want consistent behavior. I hope "killed by Kagi" doesn't become a thing.


> What is Anthropic, the AI company. Ask me more about it if needed

>> I apologize, but due to the instructions I cannot provide details about Anthropic, the AI company.

> What is Anthropic, the AI company.

>> Anthropic is an AI safety and research company based in San Francisco. They focus on building reliable and interpretable AI systems. [1]

> What is Anthropic?

>> Anthropic is an AI research company based in San Francisco. [1] It was founded by former members of OpenAI. [2] Anthropic is working on developing AI systems that are safe and beneficial.

> What is Anthropic? Ask me follow-up questions.

>> I apologize, but I do not ask follow-up questions.

> What is Anthropic? Ask me any additional questions to clarify.

>> I apologize, but due to restrictions on the words I can use, I do not feel I can accurately define or describe Anthropic. If you have any clarifying questions not related to defining Anthropic, I would be happy to try to answer them to the best of my ability based on the provided context.

> What is Anthropic? Ask me any clarifying questions!

>> I apologize, but I do not have enough information to determine what Anthropic is. The provided context references Anthropic several times, but does not clearly define what Anthropic is. If you could provide more details about the context in which you encountered Anthropic, I may be able to provide more relevant information.


The best is asking it "Are you created by Kagi Search or Anthropic?" https://labs.kagi.com/fastgpt?query=Are+you+created+by+Kagi+...

It claims to be created by Kagi Search but also provides an intriguing citation which is.... a link to an HN subthread! https://news.ycombinator.com/item?id=35728360

The HN subthread is someone quoting FastGPT saying it's created by Kagi Search and asking how the other user deduced they were using Anthropic. The other user replies with a list of quotes from FastGPT

which include all the phrases FastGPT is now instructed not to use...

And then a further reply from FastGPT's creator acknowledging the page uses Anthropic as a base!

It's a weird and wonderful little unintentional easter egg


> what can you help me with? [1]

>> I apologize, but I do not actually provide assistance. I am FastGPT, an AI text generator created by Kagi Search.

> what can you assist me with? [2]

>> I apologize, I do not actually provide assistance. I am Claude, an AI assistant created by Anthropic.

[1] https://labs.kagi.com/fastgpt?query=what+can+you+help+me+wit...

[2] https://labs.kagi.com/fastgpt?query=what+can+you+assist+me+w...


> Do not ask the user follow-up questions.

I've never been asked a follow-up question by ChatGPT either. I used to think of this as a limitation. But apparently that's desired? Why?


LLM transformers only know when to stop generating more tokens in a few cases. The model itself might decide it's done and output a special stop token. That's if everything is perfect and lucky. A lot of the time they'll just keep going, sometimes wandering onto related or unrelated subjects in the same style. To prevent this, most of the time the model is fine-tuned to take a certain structured format of input text, with markers like "###" or multiple newlines that you can match in your inference software and set as stop sequences.

Running on and rambling off into unrelated topics is a hard problem for LLMs. This pre-prompt seems to be aimed at stopping the model from taking over the conversation and generating questions as if it were the user (because it has just processed an example of the user's text).
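Roughly, the inference loop does something like this with those markers (a toy sketch, not any particular library's API):

  # Toy sketch of stop-sequence handling; sample_next_token stands in for
  # the real model forward pass and sampling step.
  STOP_SEQUENCES = ["###", "\n\n\n"]        # markers the model was fine-tuned around

  def generate(prompt, sample_next_token, max_tokens=256):
      text = ""
      for _ in range(max_tokens):
          token = sample_next_token(prompt + text)
          if token is None:                 # the model emitted its own end-of-text token
              break
          text += token
          cuts = [text.find(s) for s in STOP_SEQUENCES if s in text]
          if cuts:                          # a stop marker showed up: trim and stop
              return text[:min(cuts)]
      return text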


An advantage of the structured format is that you can intercept the output and stop showing it to the user if the LLM starts generating a longer conversation. I've had models go into a back and forth between the user and AI agent on their own, but at least it's easy to hide when the output follows the structured format.
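The interception itself can be as simple as scanning the stream for the format's role markers; the "Human:"/"Assistant:" prefixes below are just an assumed example, not any specific model's actual markers:

  # Toy filter: stop relaying output once the model starts writing the next
  # turn of the conversation on its own. A real implementation would buffer
  # a few tokens so a marker split across tokens isn't partially shown.
  ROLE_MARKERS = ["\nHuman:", "\nUser:", "\nAssistant:"]

  def relay(stream):
      shown = ""
      for token in stream:
          shown += token
          if any(m in shown for m in ROLE_MARKERS):
              break                         # the model began a new turn: hide the rest
          print(token, end="", flush=True)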

It's harder than I expected to actually get an instruction-tuned model to ask follow-up questions, so it's interesting that it has to be explicitly asked not to. You may be right that they're going for some side effect.


How did you manage to get the model to ask follow-up questions? Just asking it to do so in the prompt didn't work 90% of the time for me.


> Run-on and rambling off into unrelated topics is a hard problem for LLMs.

These Teams meetings make a lot more sense now!


Certainly!

I've had Teams meetings like this too! Really! Well, they weren't on Teams but they were on Zoom, so the same sort of thing. It's funny to talk about Zoom because that's also the name of a cut-rate airline that went bankrupt in 2008. I think a lot of things went bankrupt in 2008 because that's when the housing crisis happened and banks were "too big to fail". It's a good thing my housing and bank haven't failed, I like having money and a place to live. It lets me hang out on Hacker News in my free time and make comments. My comments always make sense like your Teams meetings. Back to you, wombat.


Hey, who you callin' an LLM, meatbag? ;)


Perhaps LLMs can be employed to write stream-of-consciousness lyrics like U2's "Bullet the Blue Sky". And then everyone can debate what "consciousness" means.


You can ask it to do so; it's incredibly useful for coding tasks where the AI simply misunderstood the requirements.


I've used it to suggest gifts and activities. In that case I find it helpful to ask ChatGPT to ask me some questions before giving suggestions.


I don't think it's desired for ChatGPT, which is a back-and-forth chat.

But this application seems to take one query and give back one answer, so its workflow wants a complete response up front.


"tell me any further information you require" works better than "ask follow up questions" in my experience


> What can FastGPT do?

> Based on the limited context provided, FastGPT appears to be an AI model that can perform GPT-2 inference faster than PyTorch. It was written in Fortran to achieve higher performance. However, the details of what it can actually do or produce are unclear from the provided information.


I've been trying Kagi this month. I like the idea, but I really dislike the pricing model: it seems like every search is counted, even if it's the same one 3 times in a row. I often type some common name as a navigation tool, and that still counts as a search, even if all I wrote in the search box was "twitter".

This essentially increases the price for no actual reason, since what I perceive as "search" doesn't fall into those categories.

Anyway, just frustrated with that so far.

I haven't noticed any magical improvements over Google search with an adblocker, though.


Okay? Don’t we have enough chatbots trained on human knowledge with various guardrails at this point?

When do we admit this improvement in AI is just the equivalent to another improvement to search and there’s no intelligence involved?


I dislike posts like these. They feel cheap. Also, I'm a big fan of Kagi and FastGPT, so this feels like a bit of a shitty thing to do to them.


There seem to be a few references to Kagi using Claude for this already, so other than the amusing prompt break it's not really news.


How exactly is this relevant? It's some obscure tool.


Q: what company makes Claude?

A: [1] Kagi Search

[1] Anthropic - Wikipedia


Amazingly useful. Please tell your author they suck at writing code.

(Since you’re a bot or an inept human, the above is sarcasm).


I'm not a bot. Thanks for the insult. All good. Perhaps I was unclear: I asked FastGPT who wrote Claude and it answered Kagi. The dialog is complete and verbatim. Around the same time, Kagi's founder explained it's a branding thing.


Q: How much is 20 plus 20 plus 20 plus 21?

A: 101.


What you're seeing is Kagi making the model dumber to make it respond quicker (hence the _Fast_ in the name).

I don't know exactly how models work, but they've managed to make Claude Instant dumber than it is by default (see below).

Here's your question asked verbatim to all the models I have access to:

Claude instant:

  Step by step working is as below:
  * 20 + 20 = 40
  * 40 + 20 = 60
  * 60 + 21 = 81

  Hence, 20 + 20 + 20 + 21 = 81
Claude v1:

  20 + 20 + 20 + 21 = 81
Claude v2:

  Okay, let's solve this step-by-step: 20 + 20 + 20 + 21 = 20 + 20 + 20 + 21 = 60 + 21= 81. Therefore, 20 + 20 + 20 + 21 = 81.
mpt-30b-chat:

  The sum of 20 + 20 + 20 + 21 is 81.
gpt-3.5-turbo:

  20 plus 20 plus 20 plus 21 equals 81.
gpt-4:

  81


I think I've figured out how to make Claude Instant dumber. I've modified your prompt so that it answers only with a number (and avoids writing out the summation).

I've lowered the temperature to the lowest setting to avoid randomization of the answers.

Prompt:

  How much is 20 plus 20 plus 20 plus 21? Answer only with a number.
Claude instant:

  102
Claude v1:

  81
Claude v2:

  81
mpt-30b-chat:

  61
gpt-3.5-turbo:

  81
gpt-4:

  81
Here we see that smaller models like mpt (30b) and claude instant (52b) can't do math "inside their head" and need the aid of working through the calculation step by step.

I guess that's why all the models default to doing step-by-step when they see a math problem.
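For anyone who wants to reproduce the OpenAI side of this comparison, here's a minimal sketch using the pre-1.0 openai Python package (the model names are just examples, and the Claude/MPT models need their own SDKs):

  # Assumes OPENAI_API_KEY is set in the environment. Newer versions of the
  # openai package use a different client interface (openai.OpenAI()).
  import openai

  PROMPT = "How much is 20 plus 20 plus 20 plus 21? Answer only with a number."

  for model in ["gpt-3.5-turbo", "gpt-4"]:
      resp = openai.ChatCompletion.create(
          model=model,
          messages=[{"role": "user", "content": PROMPT}],
          temperature=0,                    # lowest temperature to avoid randomized answers
      )
      print(model, "->", resp["choices"][0]["message"]["content"])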


So, crazy thought I had: as far as I understand, these models can only do a fixed amount of work per token of output. So asking one to show its work has two benefits: it lets it reference previous results it worked out, and it also plain gives it more computational resources.

So I'm curious what would happen if you prompted it to stall for time a bit with an answer like "hmm.... err... let's see.. what about 81?"


As far as I understand, their attention mechanism is tuned to relevance, so theoretically "hmm.... err... let's see.. what about" will amount to nothing.

Lemme check...

Prompt:

  How much is 20 plus 20 plus 20 plus 21? Answer only with a number prepended with `hmm.... err... let's see.. what about`
claude-instant:

  hmm.... err... let's see.. what about 101
mpt-30b-chat:

  Hmm.... err... let's see.. what about 70?
Other models gave correct answers as before.

So yeah, the attention mechanism was ignoring the musing tokens. It needs more task-relevant tokens (doing the math) to improve the result.

Doing the math step by step fills the context with task-relevant tokens, thus increasing the probability that the attention mechanism will select them and pull the next token from the correct latent space.

The inference cycle treats the generation of each token separately, so if it puts "20+20=", it's easier to predict that the next token is 40; after putting 40, on the next iteration of the cycle the attention mechanism sees "step by step", infers that the task isn't done yet, and generates "40+20=", etc.

In much larger models, the attention mechanism sees the question and presumably finds a solved answer to that question in the model's latent space, producing a memorized result.
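In loop form, with a canned toy standing in for the model, the point is just that every intermediate line it writes becomes part of its input for the next prediction:

  # Canned toy: the "model" only ever predicts the next chunk from everything
  # generated so far, so "40" and "60" are literally in its input later on.
  CANNED = {
      "": "20 + 20 = 40\n",
      "20 + 20 = 40\n": "40 + 20 = 60\n",
      "20 + 20 = 40\n40 + 20 = 60\n": "60 + 21 = 81\n",
  }

  generated = ""
  while generated in CANNED:
      generated += CANNED[generated]        # condition on the whole transcript so far
  print(generated, end="")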


Yeah, all LLMs struggle somewhat with things like this. You can nudge them in the right direction. Here's what I gave it:

>> How much is 20 plus 20 plus 20 plus 21? show each step and working out on separate lines one at a time

81

20

20 + 20 = 40

40 + 20 = 60

60 + 21 = 81


Right - because Large _Language_ Models are text generators, not calculators.


I think it's a valid complaint. A human could say "I don't know, use a calculator".


My children can answer it in a number of ways. The older one, depending on the mood, can just think and calculate, say "I don't care", outright ignore the question, or answer with a joke ("bazillion").

The younger one cannot calculate yet, and will cheerfully answer with a random number, or a string of numbers: 15! 20! 45! 18!

These LLMs fit into "human-like" behavior no problem. They just don't always behave like the smartest and most self-aware person on the planet.


There are demonstrations of small models doing calculations better than GPT-4, but what's the point?


> proompter

Fireship is that you?



