
Ask HN: Do GPT-3-generated texts follow Benford's law? - nerder92
As the title says, I was wondering whether it would be possible to detect a GPT-3-generated blog post by the fact that it might not respect Benford's law for word frequency. I tried this myself, using as a sample the article that tricked people here on HN (https://adolos.substack.com/p/feeling-unproductive-maybe-you-should), and it does seem to have a weird word-frequency pattern compared with other human-written articles. But I'm not sure of my findings, not least because the sample is too small to draw a conclusion from. Does this make sense? It would be nice if someone could help me figure it out.
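A minimal sketch of the check the question describes, assuming plain-text input and a simple regex tokenizer; the function names are my own, not from the thread. It counts word frequencies, takes the leading digit of each count, and compares the observed digit distribution against Benford's expected proportions P(d) = log10(1 + 1/d):

```python
# Sketch (illustrative, not a definitive detector): compare the leading
# digits of a text's word-frequency counts against Benford's law.
import math
import re
from collections import Counter

def benford_expected(digit):
    # Benford's law: P(d) = log10(1 + 1/d) for digits 1..9
    return math.log10(1 + 1 / digit)

def leading_digit_distribution(text):
    # Naive tokenization; a real experiment would need a proper tokenizer
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Leading digit of each word's frequency count
    digits = Counter(int(str(c)[0]) for c in counts.values())
    total = sum(digits.values())
    return {d: digits.get(d, 0) / total for d in range(1, 10)}

if __name__ == "__main__":
    sample = "the cat sat on the mat and the dog sat too " * 10
    observed = leading_digit_distribution(sample)
    for d in range(1, 10):
        print(d, round(observed[d], 3), round(benford_expected(d), 3))
```

Note that on a single blog post the number of distinct words (and hence of leading digits) is small, which is exactly the small-sample concern raised above: a goodness-of-fit test on so few data points carries little statistical weight.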
======
bjourne
Benford's law doesn't apply to word occurrences. Analyzing word frequencies
(1-grams) generally doesn't work because it overlooks the order of words.
Shuffling the words of this comment doesn't affect its 1-gram frequencies yet
turns it into gibberish.
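The shuffle-invariance point above can be demonstrated in a few lines of Python (the sentence used is illustrative):

```python
import random
from collections import Counter

# Shuffling a sentence's words leaves its 1-gram (word) frequencies
# unchanged, even though the shuffled result is usually gibberish.
words = "shuffling the words of this comment does not affect its frequencies".split()
shuffled = words[:]
random.shuffle(shuffled)

# Identical multisets of words -> identical 1-gram frequency counts
assert Counter(words) == Counter(shuffled)
print(" ".join(shuffled))
```

Any detector built purely on 1-gram statistics is therefore blind to word order, which is where much of the signal distinguishing coherent text from nonsense lives.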

------
ksaj
Interesting thinking. People have fairly predictable / individual word usage
patterns and modes. GPT-3 is trained on the words of a whole lot more than one
person, so that would probably skew a word frequency analysis quite a lot -
even on a paragraph by paragraph basis.

