Just replay whatever people typed while watching a video whenever someone else views it.
For example, if someone starts typing at the one-minute mark, replay their typing at the one-minute mark for every later viewer.
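A minimal sketch of that replay idea, assuming each comment is stored with the video offset (in seconds) at which its author started typing. The names ReplayLog, record and due are hypothetical, chosen only for illustration. The player would call due() on every tick with the previous and current playback positions.

```python
import bisect
from dataclasses import dataclass, field


@dataclass(order=True)
class Comment:
    offset: float                    # seconds into the video when typing began
    text: str = field(compare=False)  # ignored when ordering comments


class ReplayLog:
    """Keeps comments sorted by video offset so they can be replayed later."""

    def __init__(self):
        self._comments = []

    def record(self, offset, text):
        # insort keeps the list sorted; equal offsets keep arrival order.
        bisect.insort(self._comments, Comment(offset, text))

    def due(self, start, end):
        """Return the texts of comments whose offset falls in [start, end)."""
        lo = bisect.bisect_left(self._comments, Comment(start, ""))
        hi = bisect.bisect_left(self._comments, Comment(end, ""))
        return [c.text for c in self._comments[lo:hi]]
```

For instance, a comment recorded with offset 60.0 would be returned by due(59.0, 61.0) for any later viewer whose playback crosses the one-minute mark.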
Is it likely that anyone would respond to a comment that you make in this chat?
Perhaps. If the video is being viewed by many people simultaneously, then you might get a response from one of them.
Even if the video is not that popular, your comment is probably not that original, and someone may already have responded to one like it. So you would eventually see that response replayed anyway.
But maybe we can take it further: actively look for a response to an earlier comment like yours, then replay it sooner, or repeat it if it has already been replayed.
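One way to look for such a response, as a rough sketch: compare the new comment against earlier comments and reuse the responses of the closest match. The function name find_earlier_responses, the shape of history, and the 0.8 threshold are all assumptions made for illustration. Similarity here is plain character-level matching from Python's difflib, which is only a stand-in for whatever measure would work best in practice.

```python
from difflib import SequenceMatcher


def _normalize(text):
    # Lowercase and collapse whitespace so trivial differences are ignored.
    return " ".join(text.lower().split())


def find_earlier_responses(new_comment, history, threshold=0.8):
    """Return the responses given to the most similar earlier comment.

    history is a list of (comment_text, responses) pairs. An empty list is
    returned when no answered earlier comment is similar enough.
    """
    target = _normalize(new_comment)
    best_ratio, best_responses = 0.0, []
    for text, responses in history:
        if not responses:
            continue  # nothing to replay for unanswered comments
        ratio = SequenceMatcher(None, target, _normalize(text)).ratio()
        if ratio >= threshold and ratio > best_ratio:
            best_ratio, best_responses = ratio, list(responses)
    return best_responses
```

The returned responses could then be shown right after the viewer's own comment instead of at their original offsets.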
You will want to avoid duplicate or near-duplicate comments. Moreover, if there are too many comments in a given time segment, you will probably want a way to select which ones to show in that segment.
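Both concerns could be handled in one pass, sketched below under some assumptions: each comment carries a numeric score (say, a like count, which is hypothetical here), near-duplicates are detected with difflib's character-level similarity, and the name select_for_segment, the limit, and the 0.8 threshold are illustrative only.

```python
from difflib import SequenceMatcher


def select_for_segment(comments, limit=3, threshold=0.8):
    """Pick up to `limit` comments for one time segment.

    comments is a list of (text, score) pairs. Higher-scored comments are
    considered first, and a comment is skipped when it is too similar to
    one that was already kept.
    """
    kept = []
    for text, _score in sorted(comments, key=lambda c: c[1], reverse=True):
        if len(kept) >= limit:
            break
        norm = " ".join(text.lower().split())
        is_duplicate = any(
            SequenceMatcher(None, norm, " ".join(k.lower().split())).ratio() >= threshold
            for k in kept
        )
        if not is_duplicate:
            kept.append(text)
    return kept
```

With this sketch, "lol" and "LOL" in the same segment would collapse into whichever of the two has the higher score.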