
You've got to give it to the PyTorch team, they're really great at bringing complex optimization schemes (mixed precision, torch.compile, etc.) down to a simple-to-use API. I'm glad I moved from TF/Keras to PyTorch around 2018-2019 and never looked back. I'm eager to try this as well.

I've seen and ignored a lot of "pytorch good, tensorflow bad" takes in my time, but this is so egregiously wrong I can't help but chime in. Facilitating graph-level optimizations has been one of the most central tenets of tensorflow's design philosophy since its inception. The XLA compiler was designed in close collaboration with the tensorflow team and was available in the tensorflow API as far back as 2017. It's not an exaggeration to say that pytorch is 5+ years behind on this front. Before anyone invokes the words "pythonic" or "ergonomic", I'd like to note that the tensorflow 2 API for compilation is nearly identical to torch.compile.
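To make the comparison concrete, here's a minimal sketch of the two side by side (the toy function and shapes are mine, not taken from either project's docs):

  import tensorflow as tf
  import torch

  # TF2: decorate a Python function; jit_compile=True routes it through XLA.
  @tf.function(jit_compile=True)
  def tf_step(x, w):
      return tf.nn.relu(x @ w)

  # PyTorch: wrap a Python function (or an nn.Module) with torch.compile.
  @torch.compile
  def torch_step(x, w):
      return torch.relu(x @ w)

  # Both trace/compile on the first call with concrete inputs.
  print(tf_step(tf.ones([2, 3]), tf.ones([3, 4])))
  print(torch_step(torch.ones(2, 3), torch.ones(3, 4)))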

It's not about the API. It's about the documentation + ecosystem.

TF's docs don't seem very good. I just tried to figure out how to learn a linear mapping with TF and went through this:

1. googled "linear layer in tensorflow" and got to the page about linear.

2. spent 5 minutes trying to understand why monotonicity would be a central tenet of the documentation

3. realizing that's not the right "linear" I couldn't think of what the appropriate name would be

4. I know MLPs have them, google "tensorflow mlp example"

5. click the apr '24 page: https://www.tensorflow.org/guide/core/mlp_core

6. read through 10[!] code blocks that are basically just boilerplate setup of data and visualizations, entirely unrelated to MLPs

7. realize they call it "dense" in tensorflow world

8. see that "dense" needs to be implemented manually

9. think that's strange, google "tensorflow dense layer"

10. find a keras API (https://www.tensorflow.org/api_docs/python/tf/keras/layers/D...)


11. notice that there's a unicode rendering error ("'" for apostrophe) on kernel_initializer and bias_initializer default arguments in the documentation, and wonder why on earth for such a high-level API one would want to expose lora_rank as a first class construct. Also, 3 out of the 5 links in the "Used in the guide" links point to TF1 to TF2 migration articles - TF2 was released 5 years ago.
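For what it's worth, the thing steps 7-10 were hunting for boils down to something like this, a toy sketch with made-up data (not taken from the TF docs):

  import numpy as np
  import tensorflow as tf

  # A "linear mapping" in Keras terms: a Dense layer with no activation.
  model = tf.keras.Sequential([tf.keras.layers.Dense(1, activation=None)])
  model.compile(optimizer="sgd", loss="mse")

  # Toy data: learn y = x . [1, 2, 3]
  X = np.random.rand(256, 3).astype("float32")
  y = X @ np.array([[1.0], [2.0], [3.0]], dtype="float32")
  model.fit(X, y, epochs=100, verbose=0)

  print(model.layers[0].get_weights()[0])  # kernel should move toward [[1], [2], [3]]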

To add onto this, I feel like one of the hard things about TF is that there are at least three ways to do everything, because they have supported multiple APIs and migrated to eager execution. So if you find an example or an open source project, it might not be for the flavor of TensorFlow that your codebase is in.

Moreover, the way you find might not be the best or most efficient one.

Re 6: the TF/Keras team encourages random people to write long tutorials so they can be featured on the official site and have their tutorials included in the official guides. I have seen a lot of subpar devs/AI people write subpar tutorials and brag on Twitter about how their tutorials are included on the official Keras site.

I have seen some good ones, too, of course.


Oh god, you just gave me a flashback =) The last time I properly used TF was in early 2019, I am so happy that I don't have to deal with this anymore.

Honestly, this example holds true for roughly half of the Python ecosystem; and you can square the level of frustration if it's also anything coming from Google.

(This pattern is relatively easy to understand: smart people creating something get their gratification from the creation process, not from writing tedious documentation; and this is systemically embedded for people at Google, who are probably directly incentivised in a similar way.)



I feel like that with every single Google API doc. If there's a variable called x, the documentation will be "variable to store x". And you need to create/supply 5 different resources before you can create an x, but these will each require 5 further things to be figured out before you can create one of them.

One of the reasons I am happy to no longer do Android: GitHub samples as "documentation".

Tensorflow works really well in theory. In practice a lot less so. I saw someone spend months fighting Tensorflow to convert a production model from CPU to GPU inference with any sort of efficiency. Tons of issues due to bugs across versions, deprecations of features across versions, the graph optimizer shuffling data back to the CPU for no decent reason, etc. The person had no idea what was happening or why most of the time due to how black box Tensorflow was. This was a very senior ML engineer with a lot of Tensorflow experience.

Does tensorflow have a future? I doubt it. I don't think Google is really investing many resources into it (beyond the necessary maintenance to support whatever production models still depend on it). The cost of migrating from old TF to new TF was really large; half the projects that depend on TF that I try to use just break out of the box (only 1/4 of torch projects I try fail that way).

From what I can tell Google is moving in a direction that doesn't require tensorflow, and I don't see it gaining significant adoption outside Google, so it seems most likely we will simply see it deprecated in about 10 years. It's best to see it as a transitional technology that Jeff Dean created to spur ML development internally, which was mistakenly open sourced; now Jeff's reports typically use JAX or other systems.


GP wrote "simple to use API". You can attribute many qualities to TensorFlow, but this is not one of them.

> Facilitating graph-level optimizations has been one of the most central tenets of tensorflow's design philosophy since its inception.

Agreed of course but it's not like they came up with this approach from scratch. They seem to have just picked it up from Theano (now Aesara/PyTensor).


I think tensorflow-datasets and tensorflow-serving are great, but for model development I think most people use JAX and then export it to a tensorflow SavedModel with Orbax.
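For anyone curious what that export path looks like, here's a minimal sketch using jax2tf directly (if I understand correctly, Orbax's export utilities wrap roughly this flow; the toy model and path are made up):

  import jax.numpy as jnp
  import tensorflow as tf
  from jax.experimental import jax2tf

  def predict(params, x):
      return jnp.tanh(x @ params["w"] + params["b"])

  params = {"w": jnp.ones((3, 2)), "b": jnp.zeros((2,))}

  # Wrap the JAX function as a tf.function and write a SavedModel
  # that tensorflow-serving can load.
  tf_predict = tf.function(
      jax2tf.convert(lambda x: predict(params, x)),
      input_signature=[tf.TensorSpec([None, 3], tf.float32)],
      autograph=False,
  )
  module = tf.Module()
  module.predict = tf_predict
  tf.saved_model.save(module, "/tmp/jax_savedmodel")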

But IIUC JAX also leverages XLA, and for the purposes of this discussion the frontend matters only insofar as people feel productive using it, whether that's TF or JAX.

I'm so sorry but Tensorflow is simply one of the worst parts of my job.

Praising XLA by defending Tensorflow of all things has to be one of the strangest takes I've ever come across.

JAX is right there. No need to beat a dead horse when there's a stallion in the stables.


Tensorflow is a lot like IBM -- it deserves praise not because it's great in its current state, but for its contributions towards advancing the broader technological front to where it is today. Tensorflow walked so JAX could run, so to speak. Frankly, I don't really draw much of a distinction between the two frameworks since I really just use them as lightweight XLA wrappers.

Tensorflow started out as anything but lightweight. In my opinion it takes the cake for the kludgiest framework I've ever worked with. So verbose, so little effort put into ergonomics. Even eager mode is not really valuable unless you're working on a legacy project.

+1. As someone who has tried to migrate multiple tf.function models to torch.compile, TensorFlow's edge here is not small. torch.compile is still highly experimental. Don't believe me? Just go look at the GitHub issues where torch maintainers try to figure out why torch.compile makes code very suboptimal in a lot of cases, or produces incomprehensible errors.
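A concrete flavor of the problem (a toy sketch of mine, not from the issue tracker): data-dependent Python control flow causes a graph break, and torch.compile silently falls back to eager around it unless you ask it not to.

  import torch

  def f(x):
      # Data-dependent branch: converting a tensor to a Python bool
      # forces Dynamo to split the graph here.
      if x.sum() > 0:
          return torch.cos(x)
      return torch.sin(x)

  compiled = torch.compile(f)                # silently inserts a graph break
  strict = torch.compile(f, fullgraph=True)  # raises on the graph break instead

  x = torch.randn(8)
  print(compiled(x))   # runs, but not as one fused graph
  # strict(x)          # -> torch._dynamo error pointing at the branch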

Best way to use tensorflow is by writing models in Jax.

Not sure if my experience is relevant, but I did a couple of internships in web dev during my bachelor's degree in CS and quickly realized it wasn't for me. I then did a master's and now a PhD in medical imaging, where I extensively use machine learning (I design and train my own models, doing both supervised learning and RL), but I wouldn't say I am a researcher in AI/ML.

Because I am still in the academic process, I had the opportunity to take a couple of classes on the subject. Three books that I would recommend going over to make sure your foundations in ML and mathematics are solid are:

-Pattern Recognition and Machine Learning by Christopher Bishop

-Mathematics for Machine Learning by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong

-Deep Learning by Courville, Bengio and Goodfellow

All three are legally available online in some form. I can't say I have any experience in finding a job related to ML though.


I have a similar schedule, although slightly more organized. I try to be available from 9 to 4, at least through email if I'm not in my office, because that's when other people also work and might need me. I often leave mid-day to go running, or come in late because I went bouldering in the morning. If I don't feel inspired to write or work, I often go for a walk instead and read books tangentially related to my research topics. Then some days I'll work 12 hours straight.

I mostly work when it feels right, yes, and do something else when it doesn't. But I am doing my PhD and (in Canada) it absolutely does not pay enough to live comfortably.


No need for future generations. Lots of people in this current generation think it's an absolutely poor idea to choose coal over nuclear.


How moronic would it be, when the actual solution, renewable energy sources, isn't even discussed?

Choosing nuclear energy is the same kind of moronic behavior. It's unsustainable and, compared to coal, we have far fewer options for handling its waste on a much larger time scale. "Not being able to handle the waste" is the same problem category, and nobody cares. This is not just short-sightedness, it's also a propagandistic masterpiece.

Decades without an energy transition plan, and now, when the situation gets more dire, the debate is between non-solution A and non-solution B. But at least we have the status quo going for _us_.


Renewables weren't an option in those days. They've only recently started to become viable, and they aren't really viable at scale until the energy storage problem is also solved at a combined cost equivalent to coal or nuclear. I do think it will get there, but it's not fair to look back at the last 70 years and ask why we didn't do more renewables.


So it seems to be able to open itself quite well: https://imgur.com/a/dkEYrFW


You used to be able to use a custom domain name on your Medium blog, I think.


Kinda shows how many people don't click on the article itself


Yeah it is gross and you're really not supposed to say it in a work environment.


From the article:

  There is a lot of overlap here with the authoring process of a Word document. You don't necessarily want the real-time coauthoring experience offered by Microsoft SharePoint or Google Docs; this can inhibit your ability to determine who is responsible for specific changes to content. Branching offers a much clearer audit trail of changes. Like with code, tags can be added to signify that a minor or major version of a document is ready to be published.

While it's very cool, I'm shivering at the idea of teaching Git to non-technical people. And it still seems like a non-real-time version of Google Docs, because Google Docs does offer an audit history.

