
Is this an accurate representation of the GPT driver loop?

    def generate(prompt: str) -> str:
        # Transform a string into a list of tokens.
        tokens = tokenize(prompt)  # tokenize(prompt: str) -> list[int]

        while True:
            # Run the model.
            # Returns the next-token probabilities: a list of 50257 floats
            # (one per vocabulary entry), adding up to 1.
            candidates = gpt2(tokens)  # gpt2(tokens: list[int]) -> list[float]

            # Select the next token from the list of candidates.
            next_token = select_next_token(candidates)
            # select_next_token(candidates: list[float]) -> int

            # Append it to the list of tokens.
            tokens.append(next_token)

            # Decide if we want to stop generating.
            # It can be a token counter, a timeout, a stop word or something else.
            if should_stop_generating():
                break

        # Transform the list of tokens back into a string.
        # (Note: this includes the original prompt, not just the completion.)
        completion = detokenize(tokens)  # detokenize(tokens: list[int]) -> str
        return completion

because that looks a lot like a state machine implementing Shlemiel the painter's algorithm: every iteration re-runs gpt2 over the entire token list from scratch, so generating n tokens naively costs O(n^2) work. That throws doubt on whether the compute cost of the generative exercise is intrinsic.
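For concreteness, here's how I imagine the undefined select_next_token helper: some form of sampling over the returned probabilities. A minimal sketch, assuming plain temperature sampling (the temperature parameter and the clamping constant are my own choices, not from any real library):

    import random

    def select_next_token(candidates: list[float], temperature: float = 1.0) -> int:
        # temperature -> 0 approaches greedy argmax; 1.0 samples from the
        # original distribution. Top-k and nucleus sampling are common
        # alternatives; this is just the simplest variant.
        if temperature == 0:
            return max(range(len(candidates)), key=lambda i: candidates[i])
        # Re-weight each probability as p^(1/temperature), then sample.
        weights = [max(p, 1e-12) ** (1.0 / temperature) for p in candidates]
        threshold = random.random() * sum(weights)
        cumulative = 0.0
        for i, w in enumerate(weights):
            cumulative += w
            if cumulative >= threshold:
                return i
        return len(candidates) - 1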


I think the "context window" that people refer to with large language models means there's a maximum number of tokens that are retained, with the oldest being discarded as new ones are generated; in effect, it's a sliding window.
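In the loop above, that would just mean truncating tokens before each gpt2 call. A minimal sketch, with the limit hard-coded for illustration (1024 is GPT-2's actual maximum):

    CONTEXT_WINDOW = 1024  # GPT-2's limit; other models differ

    # Inside the while loop, just before candidates = gpt2(tokens):
    # keep only the most recent CONTEXT_WINDOW tokens.
    tokens = tokens[-CONTEXT_WINDOW:]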


Yes, that is the loop. All the magic is in the gpt2 function there.
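On the Shlemiel concern specifically: the loop really does pass the whole list each time, but real implementations typically avoid redoing the heavy per-token work with a KV cache (caching each token's attention keys and values). A toy cost model of the difference, counting one unit of work per token position pushed through the network:

    # Toy cost model, not a real transformer. It ignores the attention
    # reads over cached keys, which still grow linearly per step.
    def cost_without_cache(prompt_len: int, n_generated: int) -> int:
        # Each step reprocesses every token seen so far: O(n^2) overall.
        return sum(prompt_len + i for i in range(n_generated))

    def cost_with_kv_cache(prompt_len: int, n_generated: int) -> int:
        # The prompt is processed once; each step then only pushes
        # the single new token through the network.
        return prompt_len + n_generated

    print(cost_without_cache(50, 100))  # 9950
    print(cost_with_kv_cache(50, 100))  # 150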


This is a very small section of the algorithm: it's just how the generated tokens get collected into a sentence.



