I suppose you're alluding to xkcd's joke about this [0], which is indeed a good one, but what test does this actually pass?
The approach I was thinking of is that assuming we start with the Java program:
public class Addition {
public static int add(int a, int b) {
return a + b;
}
}
We can semi-automatically generate a basic test runner with something like this, generating some example inputs automatically:
public class Addition {
public static int add(int a, int b) {
return a + b;
}
public static class AdditionAssert {
private int a;
private int b;
public AdditionAssert a(int a) {
this.a = a;
return this;
}
public AdditionAssert b(int b) {
this.b = b;
return this;
}
public void assertExpected(int expected) {
int result = add(a, b);
assert result == expected : "Expected " + expected + " but got " + result;
System.out.println("Assertion passed for " + a + " + " + b + " = " + result);
}
}
public static void main(String[] args) {
new AdditionAssert().a(5).b(3).assertExpected(8);
new AdditionAssert().a(-1).b(4).assertExpected(3);
new AdditionAssert().a(0).b(0).assertExpected(0);
System.out.println("All test cases passed.");
}
}
Another bit of automated preparation would then automatically translate the test cases to Python, and then the actual LLM would need to generate a python function until it passes all the translated test cases:
def add(a, b):
return 4
def addition_assert(a, b, expected):
result = add(a, b)
assert result == expected, f"Expected {expected} but got {result}"
addition_assert(a=5, b=3, expected=8)
addition_assert(a=-1, b=4, expected=3)
addition_assert(a=0, b=0, expected=0)
It might not be perfect, but I think it's very feasible and can get us close to there.