leftstrokeviral's comments

leftstrokeviral · 2025-07-31T21:31:15 1753997475

How much data is the model trained on?

dvrp · 2025-07-31T21:41:25 1753998085

Copying and pasting Sangwu’s answer:

We used two types of datasets for post-training: supervised finetuning data, and preference data for the RLHF stage. You can actually use fewer than 1M samples to significantly boost the aesthetics. Quality matters A LOT. Quantity helps with generalisation and stability of the checkpoints, though.

lawlessone · 2025-08-01T00:12:18 1754007138

How is the data acquired and curated?