I'm using these things to evaluate pitches. It's well known that the default answer is "No" when seeking funding. I've been fiddling for a while, and it seems like all these engines are "nice" and optimistic. I've struggled to get them to decline companies at the rate I expect (>80%). They've been great at extracting the technicals.
This iteration isn't giving different results.
Anyone got tips to make the machine more blunt, or even aggressive?
Positivity is still an issue, but here are some workarounds I found:
- ChatGPT works best if you remove any “personal stake” from the prompt. For example, the best prompt I found to classify my neighborhood was one where I never said it was “my neighborhood” or “a home search for me”. Just input “You are an assistant that evaluates Google Street Maps photos…”
- I also asked it to assign a score from 0 to 5. It never gave a 0; it always tried to put a positive spin on things, so I now treat a 1 as a 0.
- I also never received a 4 or 5 on the first run, but once I gave it descriptions of what a 0 and a 5 should look like, it calibrated more accurately (rough sketch of that kind of anchored prompt below).
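In case it helps, here's roughly what that anchored 0-5 rubric could look like, adapted to the pitch-screening case; all of the anchors and wording are made up for illustration, not a tested prompt:

```python
# Illustrative anchored rubric prompt; the anchors and wording are invented,
# not a tuned or tested prompt.
SCORING_PROMPT = """You are an assistant that scores startup pitches on a 0-5 scale.
Anchors:
0 = not fundable: no market evidence, no credible team, no working product.
5 = exceptional: strong traction, defensible advantage, credible path to scale.
Most pitches should land at 1 or 2. Do not soften scores.
Reply with the number only, followed by one sentence of justification."""
```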
Interesting challenge! I've been playing with similar LLM setups for investment analysis, and I've noticed that the default "niceness" can be a hurdle.
Have you tried explicitly framing the prompt to reward identifying risks and downsides? For example, instead of asking "Is this a good investment?", try "What are the top 3 reasons this company is likely to fail?". You might get more critical output by shifting the focus.
Another thought - maybe try adjusting the temperature or top_p sampling parameters. Lowering these values might make the model more decisive and less likely to generate optimistic scenarios.
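Something like this is what I have in mind, a minimal sketch assuming the OpenAI Python client; the model name, temperature, and prompt wording are placeholders rather than tuned values:

```python
# Minimal sketch: risk-focused framing plus low temperature/top_p.
# Model name and parameter values are placeholders.
from openai import OpenAI

client = OpenAI()

def critique(pitch_text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",          # placeholder model
        temperature=0.2,         # lower -> more decisive, less speculative
        top_p=0.9,
        messages=[
            {"role": "system",
             "content": "You are a skeptical venture analyst. The default answer is NO."},
            {"role": "user",
             "content": f"List the top 3 reasons this company is likely to fail:\n\n{pitch_text}"},
        ],
    )
    return response.choices[0].message.content
```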
Early experiments showed I had to keep the temp low; I'm keeping it around 0.20. Based on some other comments, I might make a loop to wiggle around that zone, something like the sketch below.
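Roughly this kind of thing; `evaluate` here is just a stand-in for whatever call returns a YES/NO for a pitch at a given temperature, and the temperature grid is arbitrary:

```python
# Rough sketch: run the same pitch at a few temperatures near 0.2 and take a
# majority vote. `evaluate` is a placeholder callable, not a real API.
from collections import Counter

def sweep_decision(pitch_text, evaluate, temps=(0.10, 0.15, 0.20, 0.25, 0.30)):
    votes = Counter(evaluate(pitch_text, temperature=t) for t in temps)
    decision, count = votes.most_common(1)[0]
    return decision, count / len(temps)  # decision plus how unanimous it was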
There's the technique of model orthogonalization, which can often zero out certain tendencies (most often refusal), as demonstrated by many models on HuggingFace. There may already be an open-weights model on HuggingFace that uses orthogonalization to zero out positivity (or optimism), or you could roll your own.
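If you roll your own, the core weight edit is just a projection. A minimal PyTorch sketch, assuming you've already extracted a unit-norm "positivity direction" in the residual stream (e.g. the mean activation difference between optimistic and critical completions); the direction-finding step isn't shown and the layer names vary by architecture:

```python
# Sketch of weight orthogonalization ("abliteration"-style): remove the component
# of a weight matrix's output that writes along a given residual-stream direction.
# The direction `d` is assumed to have been extracted separately.
import torch

def orthogonalize(weight: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    d = direction / direction.norm()
    # W' = W - d d^T W : the layer's outputs no longer have a component along d
    return weight - torch.outer(d, d) @ weight

# Typically applied to every matrix that writes into the residual stream, e.g.
# attention output projections and MLP down-projections (names vary by model):
# for block in model.model.layers:
#     block.self_attn.o_proj.weight.data = orthogonalize(block.self_attn.o_proj.weight.data, d)
#     block.mlp.down_proj.weight.data = orthogonalize(block.mlp.down_proj.weight.data, d)
```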
Yes, using those exact words. I even tried instructing it that the default is No.
The most repeatable results I got came from having it evaluate specific metrics and rejecting when too many weren't found (rough sketch after this comment).
My feeling is it's the hallucination tendency routing the reasoning towards "yeah, this company could work if the stars align". It's like it's stuck with the optimism of a first-time investor.
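For reference, the metric rule is basically this; the metric list and threshold are made up, and `extracted` is whatever your extraction step returns:

```python
# Sketch of the "reject when too many metrics are missing" rule.
# Metric names and the threshold are illustrative.
REQUIRED_METRICS = ["revenue", "growth_rate", "burn_rate", "customer_count", "churn"]

def screen(extracted: dict, max_missing: int = 2) -> str:
    missing = [m for m in REQUIRED_METRICS if not extracted.get(m)]
    return "NO" if len(missing) > max_missing else "REVIEW"
```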
Maybe simultaneously give it one or more other pitches that you consider just on the line of passing and then have it rank the pitches. If the evaluated pitch is ranked above the others, it passes. Then in a clean context tell the LLM that this pitch failed and ask for actionable advice to improve it.
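A rough sketch of how that could be wired up; the prompt wording is invented, and parsing the model's reply back into a ranking is left out:

```python
# Sketch: mix the new pitch in with borderline reference pitches, ask for a
# ranking, and pass only if the new pitch beats every reference.
import random

def build_ranking_prompt(new_pitch: str, borderline_pitches: list) -> tuple:
    pitches = borderline_pitches + [new_pitch]
    random.shuffle(pitches)
    new_label = pitches.index(new_pitch) + 1  # 1-based label of the new pitch
    numbered = "\n\n".join(f"Pitch {i + 1}:\n{p}" for i, p in enumerate(pitches))
    prompt = ("Rank these pitches from most to least investable. "
              "Reply with the pitch numbers in order, best first.\n\n" + numbered)
    return prompt, new_label

def passes(ranking: list, new_label: int) -> bool:
    # since every other pitch is a borderline reference, "above the others" means first
    return ranking[0] == new_label
```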
Hm, I wonder if you could do something like a tournament bracket for pitches. Ask it to do pairwise evaluations between business plans/proposals. "If you could only invest in A -OR- B, which would you choose and what is your reasoning?". If you expect ~80% of pitches to be a no, then take the top ~20% of the tourney. This objective is much more neutral (one of them has to win), so hopefully the only way the model can be a "people-pleaser" is to diligently pick the better one.
Obviously, this only works if you have a decent size sample to work from. You could seed the bracket with a 20/80 mix of existing pitches that, for you, were a yes/no, and then introduce new pitches as they come in and see where they land.
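As a simplification of the bracket (a full pairwise sort rather than single elimination), something like this; `prefer_a` stands in for the "A or B?" call and is a placeholder, not a real API:

```python
# Sketch: sort pitches with the LLM as a pairwise comparator, keep the top ~20%.
# `prefer_a(a, b)` is a placeholder returning True if the model would invest in A over B.
import functools

def rank_pitches(pitches: list, prefer_a) -> list:
    def cmp(a, b):
        return -1 if prefer_a(a, b) else 1
    return sorted(pitches, key=functools.cmp_to_key(cmp))

def shortlist(pitches: list, prefer_a, keep_fraction: float = 0.2) -> list:
    ranked = rank_pitches(pitches, prefer_a)
    return ranked[: max(1, int(len(ranked) * keep_fraction))]
```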