Is this true in any meaningful sense?
For heavily studied games there's usually a theoretically optimal play independent of the opponent's interior state. This is obviously true for all the "solved" games, which include the simpler Heads-Up Limit Hold 'Em poker (solved by Alberta's Cepheus project), but it seems pretty clearly true for as-yet-unsolved games like Go and Chess too.
I'm very impressed by this achievement because I had expected good multi-player poker AI (as opposed to simple colluding bots found online making money today) to be some years away. But I would not expect "adaptability" to ever be a sensible way forward for winning a single strategy game.
That said, for this bot, I wouldn't say it's playing completely independent of the other players' interior state. Pluribus must infer its opponents' strategy profile and, according to the paper, maintains a distribution over possible hole cards and updates its beliefs according to observed actions. This is part of playing in a minimally exploitable way in such a large space for an imperfect-information game.
This is what interests me. It doesn't do this. In fact, because it played against itself only, it should be assumed that the only strategy profile it considers is its own.
In an n-player game, a table can be in a (perhaps unstable) equilibrium against which the "optimal" strategy will lose. This has been demonstrated for something as simple as the iterated prisoner's dilemma (tit-for-tat is "best" for most populations, but there are populations that a tit-for-tat player will lose to). I don't play poker but I've definitely experienced that in (riichi) mahjong - if you play for high-value hands the way you would in a pro league, on a table where the other three players are going for the fastest hands possible, you will likely lose.
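To make the tit-for-tat point concrete, here is a minimal sketch of a round-robin iterated prisoner's dilemma. The payoff values (5/3/1/0), the 10-round match length and the two four-player populations are assumptions picked for illustration, not taken from any particular study.

```python
from itertools import combinations

# Assumed standard payoffs: (my_move, their_move) -> my score.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(opponent_history):
    """Defect no matter what the opponent has done."""
    return "D"


def play_match(strat_a, strat_b, rounds=10):
    """Play one repeated match and return both players' total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strat_a(hist_b)
        move_b = strat_b(hist_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


def round_robin(players, rounds=10):
    """Every player meets every other player once; return total scores."""
    totals = [0] * len(players)
    for i, j in combinations(range(len(players)), 2):
        a, b = play_match(players[i], players[j], rounds)
        totals[i] += a
        totals[j] += b
    return totals


# Mostly cooperative table: the tit-for-tat players come out on top.
friendly = round_robin([tit_for_tat, tit_for_tat, tit_for_tat, always_defect])
# Table full of defectors: the lone tit-for-tat player finishes last.
hostile = round_robin([tit_for_tat, always_defect, always_defect, always_defect])
print("friendly table:", friendly)
print("hostile table: ", hostile)
```

The same strategy tops one table and trails at the other, which is the sense in which "best" depends on the population you happen to be seated with.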
I would think if professional players are utilising this information, a bot could benefit from it. I don't see how it would ever lose out from this information, even if it only uses situations where the opponent has a history of responding a certain way 100% of the time.
I am impressed by the bot, but I have to laugh a bit: years ago I joked with a friend about making an "amnesiac bot" that had no recollection of previous hands. It seemed so useless that we obviously didn't make it. We've evidently been proven wrong. (Pointless tangent there.)
The theoretically optimal play just skips that meta and meta-meta play and performs optimally anyway. Because poker involves chance, the optimal play will be stochastic, so you can stare at the noise and think you see a pattern. That just means you'll play worse against it, because you're trying to beat a ghost.
For example, suppose in a certain situation optimally I should raise $50 10% of the time. It so happens, by chance, that I do so twice in a row, and you, the note-taker, record that I "always" raise $50 here. Bzzt, 90% of the time your note will be wrong next time.
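The arithmetic behind that example, as a quick sketch. It assumes each decision is an independent draw at the stated 10% frequency; the helper name `streak_probability` is made up for illustration.

```python
from fractions import Fraction

# Raise frequency from the example above; draws are assumed independent.
raise_freq = Fraction(1, 10)


def streak_probability(freq, length):
    """Chance of seeing the same action `length` times in a row."""
    return freq ** length


# How often a mixed strategy produces two raises back to back.
p_two_raises = streak_probability(raise_freq, 2)
# How often the "always raises here" note is wrong on the next occurrence.
p_note_wrong = 1 - raise_freq

print(f"two raises in a row: {float(p_two_raises):.0%}")
print(f"note wrong next time: {float(p_note_wrong):.0%}")
```

So the misleading streak is rare in any one spot, but over many spots and many hands it will show up somewhere, and a note taken from it is wrong nine times out of ten.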
Now say I have thousands of hands viewed against you, and you raise pre-flop 50% of the time. That is pretty significant information about the types of hands you play. If I have only 10 hands I've observed, that same stat means nothing.
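One way to put numbers on the sample-size point is a confidence interval on the observed frequency. The sketch below uses a 95% Wilson score interval; the choice of interval, and 5000 hands as a stand-in for "thousands", are my assumptions.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a proportion (z=1.96 is roughly 95%)."""
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    centre = (p_hat + z**2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    )
    return centre - half_width, centre + half_width


# Same observed 50% pre-flop raise rate, very different sample sizes.
small = wilson_interval(5, 10)
large = wilson_interval(2500, 5000)

print(f"10 hands:   {small[0]:.1%} to {small[1]:.1%}")
print(f"5000 hands: {large[0]:.1%} to {large[1]:.1%}")
```

With 10 hands the plausible range runs from roughly 24% to 76%, which covers almost any playing style; with 5000 hands it narrows to roughly 48.6% to 51.4%.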
The theoretical optimal play depends on who you're playing, as more value could be extracted in certain situations vs certain players.
For example, if I've seen you face a pre-flop 3-bet 1000 times and you've folded 99% of the time, that would be a good opportunity to recognise that 3-bet bluffing this player more often would have value, and be a more optimal play than some default. Contrast that with playing someone who called pre-flop 3-bets 75% of the time: it wouldn't be optimal to 3-bet bluff there. Different opponents, different optimal plays.
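A rough expected-value sketch of those two opponents. The pot size and bluff size (in big blinds) are invented for illustration, and the model is deliberately crude: it assumes the bluff loses its full amount whenever the opponent does not fold, and it ignores 4-bets and post-flop play.

```python
# Hypothetical sizes in big blinds; not taken from any real hand.
POT_BEFORE_BLUFF = 4.0  # what we pick up when the opponent folds
BLUFF_RISK = 9.0        # what we lose when the bluff fails


def bluff_ev(fold_freq, pot=POT_BEFORE_BLUFF, risk=BLUFF_RISK):
    """Expected value of a pure bluff that always loses when not folded to."""
    return fold_freq * pot - (1 - fold_freq) * risk


def break_even_fold_freq(pot=POT_BEFORE_BLUFF, risk=BLUFF_RISK):
    """Fold frequency at which the bluff has exactly zero EV."""
    return risk / (risk + pot)


ev_vs_folder = bluff_ev(0.99)  # folds to 99% of 3-bets
ev_vs_caller = bluff_ev(0.25)  # calls 75% of the time, assumed to fold the rest

print(f"break-even fold frequency: {break_even_fold_freq():.1%}")
print(f"EV vs the 99% folder:  {ev_vs_folder:+.2f} bb")
print(f"EV vs the 75% caller:  {ev_vs_caller:+.2f} bb")
```

Under these made-up sizes the identical bluff is clearly profitable against the first player and clearly losing against the second, which is the whole argument for adjusting to the opponent.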
1. Coming up with an unexploitable strategy, then scaling it up by playing as many hands as you can, earning the slim expected value each time.
2. Picking a good table / card room / 'scene', and then trying to extract as much value from it as possible.
You most often see 1 online, and 2 live, for obvious reasons.
A skilled human would be a lot more successful, I believe, than a bot in case 2. For 2, important skills are:
1. Be entertaining. You have to play in a way that is entertaining to those playing with you, such that they want to continue playing with you (and losing money to you). Good opponents (i.e. those who are bad at poker but want to play high stakes) are hard to find, so it is vital that you retain them.
2. Cultivate a table image, then exploit it. Especially important for tournament play, where you have the concept of "key hands" that you really need to win to potentially win the tournament. With the right table image, you may be able to win hands you otherwise wouldn't have won.
3. Exploit the specifics of the players you are playing against. Yes, that also makes you exploitable, but the idea is to stay one step ahead of your opponents.
Furthermore, you can kind of account for such players by including more random or aggressive profiles in the inference/search stage.
I don't think you play very much, which is fine, but makes this discussion a bit pointless.