Nvidia Canvas: AI for creating realistic landscape images (nvidia.com)
32 points by pj3677 on July 18, 2022 | 18 comments


Nvidia's approach to software is really interesting, and demonstrates that they have a hardware culture through and through.

They could turn Canvas into a web app and charge a monthly subscription. Alternatively they could go the OpenAI GPT-3/DALL-E-2 route and give it away as a web app or API to generate a huge potential customer list. Instead, they're only interested in technology demonstrations.

I'm not arguing that this is a good _or_ bad thing. It's just interesting to watch a company drive one of the greatest innovations humankind has ever developed (AI), yet fail to capitalize on the resulting value creation because of a hardware-focused culture.


They are selling RTXs via Canvas, no more, no less.


Exactly - they could think bigger than selling hardware alone.


Maybe they don't want to deal with us web geeks. It would burn through their profits :p


It demonstrates that they feel the software is better off free as a way to push their GPUs, and that this has a higher return than charging for it.

I'm quite lost on who you think would pay for this kind of software?


The average consumer is more willing to pay for hardware than software, even if all the value is created by the software.

It's the reason why Apple no longer charges for OS updates. Nvidia has essentially the same business model.

It's very hard to sell software to $averageconsumer unless that software is "free"


There is a large gap from tech demo to product, and it's even larger for AI products.


It’s fun to play with but most people really don’t have a reason to spend money on this kind of software.


They have a monopoly on AI hardware, so it makes perfect sense for them to make people care about AI.


Can I ask, was there an underlying reason that people decided to pursue this image generation task, or is this literally just the result of throwing lots of tasks at different types of AI until you finally find one it seems to do well?

I don't mean to denigrate this, and the results are clearly interesting, but I just don't understand what problem this solves. It just seems to raise the noise floor on reality.


There is a real use case for this type of technology.

I'm the founder of https://ayvri.com, and we have a 3D virtual world where outdoor athletes watch their activities, and the activities of others.

As the resolution (and speed) of our 3D world improved, people got more interested and engaged with it.

I believe this is the future of video. Not volumetrically created through 20+ cameras, but with a single camera capturing the scene, and AI filling in the blanks based on what it knows.


Right now, a whole bunch of architectures are being discovered to be good for certain tasks.

At some point, there is going to be some sort of higher-level ML research into generating an architecture for a particular task, and all of this research is going to feed into that.


When you look at GANs, I could see the data from this endeavor being used to improve their output. I get that it doesn't seem to have a direct application, but I suspect it is actually quite valuable to the media and entertainment space in the long run.


Probably an AI that could help us rewatch our dreams?


tl;dr - It’s a GAN. GANs have some interesting limitations but can output 1024px images in real time on a consumer GPU.

The training labels may have been “segmentation maps”. These are regions of an image with a known scene description such as “cloud”, “trees”, “sky”. I’m not certain what model they use, but I bet it is a StyleGAN2/3 modified to generate an image from a given set of segmentation masks.
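To make the segmentation-map idea concrete, here is a minimal sketch in plain Python. It only shows how a painted label grid can be split into one binary mask per class, which is a common way to feed such maps to a conditional generator. The label names and the `one_hot` helper are hypothetical, chosen for illustration; nothing here is confirmed to match what Canvas does internally.

```python
# Hypothetical label set; the classes Canvas actually uses are not stated in this thread.
LABELS = ["sky", "cloud", "trees"]


def one_hot(seg_map, num_classes):
    """Turn an H x W grid of class ids into num_classes binary H x W masks."""
    return [
        [[1 if cell == c else 0 for cell in row] for row in seg_map]
        for c in range(num_classes)
    ]


# A tiny 3x4 "painting": sky on top with a cloud in it, trees along the bottom.
seg_map = [
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [2, 2, 2, 2],
]

masks = one_hot(seg_map, len(LABELS))
for name, mask in zip(LABELS, masks):
    print(name, mask)
```

Each pixel belongs to exactly one mask, so the stack of masks carries the same information as the scribble, just in a form a convolutional network can take as input channels.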

Indeed, without the research context, it’s a little strange “why” you would want a product like this. Nvidia has done a lot of research to get GANs running very fast on their RTX cards, which works because GANs are mostly convolutional and operate directly in pixel (or wavelet) space rather than an embedding space. On my RTX 2070, I can run StyleGAN2 at 1024px at a somewhat reasonable 10 FPS.


Canvas remains my favorite way of demonstrating both the potential and the danger of AI to Boomers, parents, grandparents, and just non-techies in general.

Even the most computer-illiterate of people these days are able to scribble an MS Paint landscape and have it (usually) turn into a gorgeous seascape or mountain vista.

It's the first program/app since maybe WordLens (that old iOS real-time translation overlay app) that gets consistent “wows” out of virtually everybody.


Can someone please make this, but for creating an orchestral arrangement from a piano theme (maybe with some hints)?


Would be incredibly useful if this could generate HDRIs.



