
I read all the perftests in the repo. I think nearly all of them parse a structure containing a couple hundred thousand repetitions of the same or similar thing, and the timing function returns the min and max of 5 attempts. I just picked one example for posting.
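
Roughly this shape, as a sketch with made-up names (the actual perftests differ in detail):

    import time

    # A payload that repeats the same small structure many times.
    raw = [{"name": "x", "value": 1}] * 200_000

    def time_min_max(f, attempts=5):
        # Time f several times; report the best and worst runs.
        timings = []
        for _ in range(attempts):
            start = time.perf_counter()
            f()
            timings.append(time.perf_counter() - start)
        return min(timings), max(timings)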

Not a Python expert, but could the Pydantic tests be unrealistic and/or misleading because they use kwargs in __init__ [1] to parse the object instead of calling the parse_obj class method [2]? As described in PEP 692 [3], doesn't Python collect those keyword arguments into a new dictionary, and wouldn't that allocation be included in the timing? It would be unfortunate if that accounted for the difference.
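
For illustration, the two call styles look like this (a sketch against pydantic's v1 API, which the linked docs describe; Item and its fields are made up):

    from pydantic import BaseModel

    class Item(BaseModel):
        name: str
        value: int

    data = {"name": "x", "value": 1}

    # Keyword unpacking: the caller spreads data into keyword
    # arguments, and pydantic's __init__(**data) collects them
    # into a fresh dict, so that copy lands inside the timed code.
    a = Item(**data)

    # The v1 helper takes the dict directly, avoiding the
    # unpack/repack at the call boundary.
    b = Item.parse_obj(data)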

Something else I think about: if a performance test doesn't produce a side effect that is checked, a smart compiler or runtime could optimize the whole benchmark away, or the work may become easy enough for the CPU's branch predictor to trivialize. I recall this happening to me in Java in the past, though it probably isn't happening here in Python.
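
The usual guard is to fold the output into a value the caller must observe; a minimal sketch with hypothetical names:

    def bench_checked(parse, raw, attempts=5):
        checksum = 0
        for _ in range(attempts):
            result = parse(raw)
            # Consume the output so a JIT-style runtime (JVM, PyPy)
            # can't prove the parse is dead code and eliminate it;
            # CPython performs no such optimization anyway.
            checksum ^= len(result)
        return checksum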

[1] https://github.com/ltworf/typedload/blob/37c72837e0a8fd5f350...

[2] https://docs.pydantic.dev/usage/models/#helper-functions

[3] https://peps.python.org/pep-0692/



How would you implement a benchmark then?

The kwargs thing is true… but I didn't design the API of pydantic. And it only happens on the small top-level dictionary, not on all the nested ones (unless pydantic internally does it all the time).

Java has a JIT; I agree that keeping the output value is important in that case. CPython isn't that smart. I honestly never tried benchmarking with PyPy. I guess it could be interesting to try that.



