You've got to give it to the PyTorch team, they're really great at bringing complex optimization schemes (mixed-precision, torch.compile, etc.) down to a simple-to-use API. I'm glad I moved from TF/Keras to PyTorch around 2018-2019 and never looked back. I'm eager to try this as well.
I've seen and ignored a lot of "pytorch good, tensorflow bad" takes in my time, but this is so egregiously wrong I can't help but chime in. Facilitating graph-level optimizations has been one of the most central tenets of tensorflow's design philosophy since its inception. The XLA compiler was designed in close collaboration with the tensorflow team and was available in the tensorflow API as far back as 2017. It's not an exaggeration to say that pytorch is 5+ years behind on this front. Before anyone invokes the words "pythonic" or "ergonomic", I'd like to note that the tensorflow 2 API for compilation is nearly identical to torch.compile.
11. notice that there's a unicode rendering error ("'" for apostrophe) on the kernel_initializer and bias_initializer default arguments in the documentation, and wonder why on earth such a high-level API would expose lora_rank as a first-class construct. Also, 3 out of the 5 links in the "Used in the guide" section point to TF1 to TF2 migration articles - TF2 was released 5 years ago.
To add onto this, I feel like one of the hard things about TF is that there are at least 3 ways to do everything, because they have supported multiple APIs and migrated to eager. So if you find an example or an open source project, it might not be for the flavor of tensorflow that your codebase is in.
Re 6: The TF/Keras team motivates random people to write long tutorials by featuring them on the official site and including them in the official guides. I have seen a lot of subpar devs/AI people write subpar tutorials and brag on twitter about how their tutorials are included on the official Keras site.
Honestly, this example holds true for roughly half of the Python ecosystem; and you can square the level of frustration if it's also anything coming from Google.
(This pattern is relatively easy to understand: smart people creating something get their gratification from the creation process, not writing tedious documentation; and this is systemically embedded for people at Google, who are probably directly incentivised in a similar way.)
I feel like that with every single Google API doc. If there's a variable called x, the documentation will be "variable to store x". And you need to create/supply 5 different resources before you can create an x, but these will each require 5 further things to be figured out before you can create one of them.
Tensorflow works really well in theory. In practice a lot less so. I saw someone spend months fighting Tensorflow to convert a production model from CPU to GPU inference with any sort of efficiency. Tons of issues due to bugs across versions, deprecations of features across versions, the graph optimizer shuffling data back to the CPU for no decent reason, etc. The person had no idea what was happening or why most of the time due to how black box Tensorflow was. This was a very senior ML engineer with a lot of Tensorflow experience.
Does tensorflow have a future? I doubt it. I don't think Google is really investing many resources into it (beyond the necessary maintenance to support whatever production models still depend on it). The cost of migrating from old TF to new TF was really large; half the projects that depend on TF that I try to use just break out of the box (only 1/4 of torch projects I try fail that way).
From what I can tell, Google is moving in a direction that doesn't require tensorflow, and I don't see it gaining significant adoption outside Google, so it seems most likely we will simply see it deprecated in about 10 years. It's best to see it as a transitional technology that Jeff Dean created to spur ML development internally, which was mistakenly open sourced, and now Jeff's reports typically use Jax or other systems.
> Facilitating graph-level optimizations has been one of the most central tenets of tensorflow's design philosophy since its inception.
Agreed, of course, but it's not like they came up with this approach from scratch. They seem to have just picked it up from Theano (now Aesara/PyTensor).
I think tensorflow-datasets and tensorflow-serving are great, but for model development I think most people use JAX and then export it to a tensorflow SavedModel with Orbax.
But IIUC Jax also leverages XLA, and for the purpose of this discussion the frontend matters only inasmuch as people feel productive using it, whether that's TF or Jax.
Tensorflow is a lot like IBM -- it deserves praise not because it's great in its current state, but for its contributions towards advancing the broader technological front to where it is today. Tensorflow walked so JAX could run, so to speak. Frankly, I don't really draw much of a distinction between the two frameworks since I really just use them as lightweight XLA wrappers.
Tensorflow started out as anything but lightweight. In my opinion it takes the cake for kludgiest framework I've ever worked with. So verbose, so little effort put into ergonomics. Even eager mode is not really valuable unless you're working on a legacy project.
+1. As someone who has tried to migrate multiple tf.function codebases to torch.compile, tensorflow's edge here is not small. torch.compile is still highly experimental. Don't believe me? Just go look at the GitHub issues where torch maintainers try to figure out why torch.compile makes code very suboptimal in a lot of cases, or results in incomprehensible errors.
Not sure if my experience is relevant, but I did a couple of internships in web dev during my bachelor's degree in CS and quickly realized it wasn't for me. I then did a master's and am now doing a PhD in medical imaging, where I extensively use machine learning (I design and train my own models, doing both supervised and RL), but I wouldn't say I am a researcher in AI/ML.
Because I am still in the academic process, I had the opportunity to take a couple of classes on the subject. Three books that I would recommend going over to make sure your foundations in ML and mathematics are solid:
-Pattern Recognition and Machine Learning by Christopher Bishop
-Mathematics for Machine Learning by Marc Peter Deisenroth
-Deep Learning by Courville, Bengio and Goodfellow
All three are legally available online in some form. I can't say I have any experience in finding a job related to ML though.
I have a similar schedule, although a slightly more organized one. I try to be available from 9 to 4, at least through email if I'm not in my office, because that's when other people also work and might need me. I often leave mid-day to go running or come in late because I went bouldering in the morning. If I don't feel inspired to write or work, I often go for a walk instead and read books tangential to my research topics. Then some days I'll work 12 hours straight.
I mostly work when it feels right, yes, and do something else when it doesn't. But I am doing my PhD and (in Canada) it absolutely does not pay enough to live comfortably.
How moronic would it be, when the actual solution, renewable energy sources, is not even discussed?
Choosing nuclear energy is the same moronic behavior. It's unsustainable and, compared to coal, we have far fewer options for handling its waste, on a much larger time scale.
"Not being able to handle the waste" is the same problem category and nobody cares. This is not just short-sightedness, it's also a propagandistic masterpiece.
Decades without an energy transition plan and now, when the situation gets more dire, the debate goes between non-solution A and non-solution B. But at least we have the status quo going for _us_.
Renewables weren't an option in those days. They've only recently started to become viable, and are not really viable at scale until the energy storage problem can also be solved at a combined cost equivalent to coal or nuclear. I do think it will get there, but it's not fair to look back at the last 70 years and ask why we didn't do more renewables.
There is a lot of overlap here with the authoring process of a Word document. You don’t necessarily want the real-time coauthoring experience offered by Microsoft SharePoint or Google Docs; this can inhibit your ability to determine who is responsible for specific changes to content. Branching offers a much clearer audit trail of changes. Like with code, tags can be added to signify that a minor or major version of a document is ready to be published.
While it's very cool, I'm shivering at the idea of teaching Git to non-technical people. And it still seems like a non-realtime version of Google Docs, because Gdocs does offer audits.