I used LLaMA Factory; here are three core insights from my fine-tuning runs:
- a constant learning rate is more stable than a decaying schedule
- 3 epochs was the sweet spot when training with small batches
- a lower effective batch size compensates for a small dataset
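The settings above can be sketched as a config fragment. The key names follow the Hugging Face TrainingArguments convention that LLaMA Factory configs generally mirror, but the exact names and the batch-size values here are illustrative assumptions, not taken from my runs:

```python
# Hypothetical hyperparameter fragment mirroring HF TrainingArguments-style
# keys (as commonly used in LLaMA Factory YAML configs); verify key names
# against your installed version.
config = {
    "lr_scheduler_type": "constant",   # constant lr: more stable in practice
    "num_train_epochs": 3,             # ~3 epochs worked best with small batches
    "per_device_train_batch_size": 2,  # example value, an assumption
    "gradient_accumulation_steps": 4,  # example value, an assumption
}

# Effective batch size = per-device batch * accumulation steps * num GPUs.
# Keeping it low gives more optimizer steps per epoch, which helps
# compensate for a small dataset.
num_gpus = 1  # assumed single-GPU setup
effective_batch_size = (
    config["per_device_train_batch_size"]
    * config["gradient_accumulation_steps"]
    * num_gpus
)
print(effective_batch_size)  # 8
```

With these example numbers the effective batch size is 8; shrinking either factor lowers it further if the dataset is very small.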
I think it's a great product for non-ML engineers who want a custom LLaMA model for their business use cases.