Using Pydantic, FastAPI, and the OpenAPI generator tool was a decision that sped up my coding considerably. Plus it didn't hurt that FastAPI has excellent documentation (I highly recommend the part that describes using FastAPI with SQLAlchemy).
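For anyone who hasn't tried that stack, this is roughly what the combination looks like - a minimal sketch, with the `Item` model and endpoint made up for illustration:

    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class Item(BaseModel):
        name: str
        price: float

    @app.post("/items/")
    def create_item(item: Item):
        # The request body is parsed and validated against Item,
        # and the endpoint shows up in the generated OpenAPI schema.
        return item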
I love FastAPI for a lot of the concepts it helped make mainstream, such as typing-based validation, but its documentation is lacking.
I love the tutorials, but an API reference is just as important; I may not want to go through long tutorials about things I've already read, but being able to check a detailed description of each function would be super helpful - other frameworks such as Sanic have it.
I think it's on the roadmap, kudos for that, and I hope the project gets a few more maintainers to speed up the process.
I'm hesitant, however, since marshmallow-sqlalchemy provides full integration with your SQLAlchemy models, whereas pydantic-sqlalchemy is only for generating Pydantic models from SQLAlchemy models, and it seems to still be experimental (why does it have more stars? :thinking:)
Otherwise, just between Pydantic and Marshmallow for straight-up validation, Pydantic seems more legible and easier to use at first sight.
I'd switch if Pydantic had full integration with SQLAlchemy.
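For context, pydantic-sqlalchemy's scope really is just model generation - roughly like this (a minimal sketch; the User model is only an illustration):

    from sqlalchemy import Column, Integer, String
    from sqlalchemy.orm import declarative_base
    from pydantic_sqlalchemy import sqlalchemy_to_pydantic

    Base = declarative_base()

    class User(Base):
        __tablename__ = "users"
        id = Column(Integer, primary_key=True)
        name = Column(String)

    # Generates a plain Pydantic model mirroring User's columns;
    # it doesn't participate in sessions, queries, or migrations.
    PydanticUser = sqlalchemy_to_pydantic(User)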
And for those of you looking down on me for using ORMs (yes, I know some of you exist), I use both raw SQL and SQLAlchemy.
I find it far easier to build models and deal with migrations in SQLAlchemy than to write migration scripts by hand.
The creator of FastAPI and pydantic-sqlalchemy has recently released a new library: SQLModel.
https://sqlmodel.tiangolo.com
It is a thin layer on top of Pydantic and SQLAlchemy. I haven't used it yet, so I can't speak from experience, but I think it is basically exactly what you describe.
As I understand it, you don't, since SQLModel inherits from SQLAlchemy's ORM base classes. From the user guide, this is an example of how to define the model and generate the table.
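(Reproducing it from memory, so details may differ slightly from the current docs.)

    from typing import Optional
    from sqlmodel import Field, SQLModel, create_engine

    class Hero(SQLModel, table=True):
        # table=True makes this both a Pydantic model and a SQLAlchemy table.
        id: Optional[int] = Field(default=None, primary_key=True)
        name: str
        secret_name: str
        age: Optional[int] = None

    engine = create_engine("sqlite:///database.db")
    SQLModel.metadata.create_all(engine)  # creates the "hero" table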
We use this extensively, as probably half of the Python programmers on HN do. It's great, but there are plenty of alternatives out there as well, especially for performance. Our specific use case is IO limited, so it's perfect for that.
I've found that using construct() to skip validation is great when a model has e.g. a `List[float]` attribute with a length of 10k+ items. But this has to be done judiciously.
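Something like this, assuming a pydantic v1 model (the `Signal` model is made up for illustration):

    from typing import List
    from pydantic import BaseModel

    class Signal(BaseModel):
        name: str
        samples: List[float]

    payload = {"name": "sensor-1", "samples": [0.1] * 10_000}

    # Normal path: every one of the 10k floats gets validated/coerced.
    validated = Signal(**payload)

    # construct() skips validation entirely - only safe when the data
    # is already known to be well-formed (e.g. from a trusted source).
    trusted = Signal.construct(**payload)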
Are those alternatives faster according to benchmarks?
In my experience, the bottleneck is either:
- JSON parsing and dumping; the solution for me is orjson, a fantastic library for fast JSON serialisation of the most common field types, including datetime (see the sketch after this list).
- Validation; if you choose to validate your data, Pydantic can indeed be slow... but the problem isn't Pydantic itself, it's the validation you apply to your data.
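For reference, wiring orjson into a pydantic (v1) model looks roughly like this; the `Event` model is just an illustration:

    from datetime import datetime

    import orjson
    from pydantic import BaseModel

    def orjson_dumps(v, *, default):
        # orjson.dumps returns bytes; pydantic's .json() expects a str.
        return orjson.dumps(v, default=default).decode()

    class Event(BaseModel):
        id: int
        created_at: datetime

        class Config:
            # Replace the stdlib json module for (de)serialisation.
            json_loads = orjson.loads
            json_dumps = orjson_dumps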
> But the problem isn't Pydantic itself, it's the validation you apply to your data.
Indeed, this kind of validation is usually based on 'isinstance', which is really slow in Python because you often need to call it many times. More than once, I doubled or tripled the throughput of some data pipelines (not microbenchmarks) just by replacing 'isinstance' calls with something else.
When you really need something like isinstance, type equality sometimes works and is much faster. For example, this works as a replacement for attrs-strict's type checking on a limited subset of types (non-generic classes, Any, Optional, Tuple, and Union): https://archive.softwareheritage.org/swh:1:cnt:7f4f1ea32eace... The downside is that you can't use subclasses of the specified types.
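The replacement pattern is basically this (a sketch, not the linked code):

    from typing import Any

    def is_instance_slow(value: Any, expected: type) -> bool:
        # isinstance() walks the MRO and honours __instancecheck__,
        # which makes it flexible but comparatively slow in hot loops.
        return isinstance(value, expected)

    def is_instance_fast(value: Any, expected: type) -> bool:
        # Plain type equality skips that machinery, at the cost of
        # rejecting subclasses of the expected type.
        return type(value) is expected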
I don’t think they’ll consider your framework at this point because it doesn’t have enough mindshare. I don’t think they’re being unfair or blacklisting you. Also, your PR was very forward and presumptuous IMO - “can I get an issue number”, as if you’re entitled to one. Get some more users (stars being a proxy for that), and I’m sure they’ll consider you if you ask with a little more humility.
“Way faster” is a bit hyperbolic, isn’t it? 2.5 times faster? That’s not an order of magnitude. What validation use cases are there where it makes a difference? Very few.
Here’s the thing: if people really want fast validation and transformation, they’re probably not going to use Python. People use Python for its developer ergonomics and experience. Dicts as a configuration DSL are inferior to classes and type hints for very simple reasons (sketched in code after this list):
1. One bad thing about Python is that both single and double quotes are acceptable. Dicts are built with strings, so there’s an anxiety about having to standardize on one over the other, which adds to cognitive overhead.
2. There are dict constructions that get rid of the quoted keys, but again, that’s just another “choice” people don’t care to make if they don’t have to.
3. Curly brackets aren’t the easiest things to type relative to other characters.
4. Curly brackets are extremely difficult to pair if you don’t have an editor that does it automatically or if your code formatter puts many brackets on the same line.
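A side-by-side sketch of what I mean (the config shape here is made up):

    from pydantic import BaseModel

    # Dict-as-DSL: every key is a quoted string, curly brackets everywhere,
    # and a typo in a key only fails at runtime (if it fails at all).
    server_config = {
        "host": "localhost",
        "port": 8080,
        "debug": True,
    }

    # Class + type hints: no quoting decisions, fewer brackets,
    # and editors/type checkers catch mistakes before the code runs.
    class ServerConfig(BaseModel):
        host: str = "localhost"
        port: int = 8080
        debug: bool = False

    config = ServerConfig(port=9000)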
There is no need for hostility.
My comment was in response to your claim, based on the benchmark, that pydantic is the fastest.
In their own documentation they invite all other frameworks to send their results.
That said, what the pydantic team does is up to them; Maat was made before pydantic was an option, and it has filled that use case.
The benchmark was only added because of an internal discussion about which tool to use.
Engineering is about tradeoffs, and each project will have its own technical problems. Therefore there will never be the best solution, only the best solution for a particular problem within that context.
A different language, most likely. I'd use Python for this if the work being done isn't too CPU-intensive and I can afford to make big tradeoffs.
If you want to do data validation like this but for something with better performance while still retaining the benefits of a high-level GC'd language, then I'd try something like https://github.com/go-playground/validator for Go.
HN Guidelines state, among other things worth noting [1]:
> Please don't complain about tangential annoyances—things like article or website formats, name collisions, or back-button breakage. They're too common to be interesting.
Sorry if I came across as unhelpful; my intent was actually the opposite - to inform and hopefully educate. I don't like wasting what time I have left on this earth bothering people with unwelcome or unappreciated feedback (though I accept this is always a risk when engaging, especially via Internet forums where tone is not discernible).
Your argument would be strengthened if you added more explanation or insight into why projects using your disliked, cursed documentation format definitively suck :)
Feedback is always welcome, I'm always looking to improve, learn how to be more persuasive, and avoid miscommunication.
PEP 563, PEP 649 and the future of pydantic and FastAPI - https://news.ycombinator.com/item?id=26826158 - April 2021 (150 comments)
Show HN: Pydantic – Data validation using Python 3.6 type hinting - https://news.ycombinator.com/item?id=14477222 - June 2017 (27 comments)