To evaluate semantic search, we generated synthetic questions on our own repos with davinci-003. We can probably generate tougher and more realistic queries with GPT-4, and we would like to re-create this eval and open-source it.
We don't have precision/recall numbers for CodeSearchNet, which is probably the biggest eval in this area.
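For reference, precision and recall at a cutoff k for a single search query could be computed roughly as below. This is a minimal sketch, not our eval harness: the function name `precision_recall_at_k`, the file-path identifiers, and the cutoff are all illustrative assumptions, and a given benchmark may report different metrics.

```python
from typing import Sequence, Set, Tuple


def precision_recall_at_k(
    ranked: Sequence[str], relevant: Set[str], k: int
) -> Tuple[float, float]:
    """Return (precision@k, recall@k) for one query.

    ranked:   result identifiers in the order the search returned them
    relevant: identifiers labeled as correct answers for the query
    k:        cutoff; only the first k results are scored
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked[:k]
    # Count how many of the top-k results are labeled relevant.
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    # With no labeled answers, recall is undefined; report 0.0 here.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical query: the search returned four files, three are labeled relevant.
    ranked = ["a.py", "b.py", "c.py", "d.py"]
    relevant = {"b.py", "d.py", "z.py"}
    print(precision_recall_at_k(ranked, relevant, k=3))
```

Averaging these per-query values over all synthetic questions would give a single pair of numbers per model or index configuration.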