A project I dream about but have neither the skills nor the discipline to implement: an entire OS written just for the RPi 4, with code that takes advantage of the fact that we are running on known hardware, so we can optimize it rigorously.
If you focused on supporting the official raspberry pi keyboard, mouse, and touchscreen you could circumvent a lot of the pains around driver issues. Then people could actually get up and running with your OS with real hardware, and you could start dogfooding it.
I’ve heard people say that the software we use is up to 100 times slower than it needs to be. So my hypothesis is that if the software was written smarter, and used the fact that we know the hardware ahead of time, we should easily hit a 100 times performance increase across the OS.
Also if it were possible, it would be cool if this OS supported a minimal boot mode, that could be configured to only run the bare minimum amount of apps required for a certain piece of software. So for example a game mode that ran the game, and the bare minimum amount of OS the game needed to run.
And since we are in full on fantasy land we can take this one step further. Same basic concept, but with a RISC-V SBC, with an open GPU. Bonus points if you can get up and running with a touchscreen, keyboard, mouse, case, and SD card for $150
The statement "the software we use is up to 100 times slower than it needs to be" is not about OS overhead (which is relatively small) but about user-facing application software written to optimize development speed/cost rather than performance.
A reasonable game is performance-focused and spends most of the computing power directly on itself (even if perhaps not optimally, and not utilizing specific GPU hardware to its full extent) rather than in OS routines. So an OS providing a specific "game mode" with only the bare minimum of OS the game needs to run can perhaps bring a 5% or 10% performance improvement, but not 50%, and definitely not a 100x performance increase.
For software that was written in a performance-insensitive way (i.e., not games) you perhaps could achieve a 100 times performance increase by a full rewrite avoiding various overheads. However, you would not really need an OS change for that; the main performance-relevant changes would be in the app itself, and you can get almost all of that by running the optimized app on a standard OS - but you would need a rewrite of all the actual software.
The example in this article about Unreal Engine and Vulkan is a good illustration: a game targeting the specific hardware directly could achieve the same performance (and more!), but nobody is going to rewrite the game because that's expensive. So an OS abstraction layer like Vulkan - which inevitably adds some extra performance overhead rather than reducing it - is the only reasonable way to go.
Maybe it's because I've been on Windows for too long, but people are always complaining about how terrible an experience Windows 10 (and soon 11) is (see the recent slowdown with AMD processors). That, and a little too much time watching all the cool things people did with 1 MHz processors with no L1 cache or graphics on 80s computers. As someone who wasn't around during that time, it's easy to get the feeling that a 1.5 GHz quad-core CPU plus a 0.5 GHz GPU should absolutely slay any program, but in my experience using the RPi 4 as a primary computer it hasn't felt that way.
> For software that was written in a performance-insensitive way (i.e., not games) you perhaps could achieve a 100 times performance increase by a full rewrite avoiding various overheads. However, you would not really need an OS change for that; the main performance-relevant changes would be in the app itself, and you can get almost all of that by running the optimized app on a standard OS - but you would need a rewrite of all the actual software.
I think this is insightful. I agree that not everything would be 100x, but I feel like at the level of an RPI 4 even a 5-10% improvement that bumps a game into 30 or 60fps territory would be noticeable. And even a 2x improvement in regular apps would have impact.
The reason you feel the RPi 4B isn't performing is that it's running software written to 2010s and 2020s software standards, languages, and runtimes.
Before 2000, many, many commercial games and applications were written in "classical" C/C++ and had any performance-critical areas written in assembly language. People would actually profile their apps and optimize their code and loops. Almost any language that had a large runtime or included a garbage collector was not considered an option. Only the most complex, data-oriented or GUI-oriented parts of the code would be written in a high-level language. A good example of this mixed-language code base is the Allegro 4.x Game Programming Library for DOS, Linux, and Windows.
Going back further into the 80s and 90s, many, many commercial games and applications were written entirely in assembly language. Look at the old MS-DOS code bases on GitHub for a reference.
All that to say that modern software does a ton of things on the periphery that old software considered superfluous. We know some of it makes software easier to write, or makes the software more secure, or lets it take advantage of hardware features that old machines didn't have. However, that is still extra code that needs to run, often in (event) loops, which multiplies the amount of time dedicated to that code.
Look at codegolf or other minimal code challenges to see what could be accomplished with a different programming process.
Mouse and keyboard are as standard and supported as they can possibly be, and the official Raspberry Pi ones are ordinary USB ones supported everywhere. Touchscreens supported by Linux work well out of the box too (e.g. my Dell ones). "100 times slower" sounds like nonsense to me; huge numbers of people are motivated to make software faster, so if there was that much room to improve, it mostly would have happened already.
I agree. But the context of this is around a solo developer doing this as a hobby OS. And these constraints would help the project have reasonable scope assuming no-one else expressed any interest in it.
Just checked: looks like Vulkan under DRM on the Pi 4 works, and at least some people in the Libretro ecosystem have already messed around with it, so this could benefit Lakka. Awesome. Maybe this'll mean getting to play with some decent CRT shaders on the Pi without an unacceptable performance hit, and/or getting to make better use of RetroArch's advanced input lag reduction features.
CRT Royale running at 60 fps on a sub-15 W machine would be impressive. 1080p would be nice, 1440p would be great, and 4K would be best. The Pi 4 can output 4K60, but I really doubt it can shove through that many simulated pixels.
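Quick back-of-the-envelope sketch of the pixel throughput involved (the taps-per-pixel figure is just an assumed placeholder, not CRT Royale's real cost):

    // Pixel throughput at 60 fps for each resolution. The "taps per pixel"
    // value is a made-up stand-in for a heavy CRT shader's cost.
    #include <cstdio>

    int main() {
        struct Mode { const char* name; long long w, h; };
        const Mode modes[] = {{"1080p", 1920, 1080},
                              {"1440p", 2560, 1440},
                              {"4K",    3840, 2160}};
        const long long fps = 60;
        const long long assumedTapsPerPixel = 32; // placeholder shader cost

        for (const Mode& m : modes) {
            long long pixelsPerSecond = m.w * m.h * fps;
            std::printf("%s@60: %lld Mpix/s, ~%lld M taps/s with the assumed shader\n",
                        m.name, pixelsPerSecond / 1000000,
                        pixelsPerSecond * assumedTapsPerPixel / 1000000);
        }
        return 0;
    }

4K60 already works out to roughly 500 million shaded pixels per second before the shader gets expensive, which is why 1080p looks like the realistic target to me.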
I had decent success with PS1 games on the Pi2, which surprised me. Most games I tried ran well. I haven't tried on the Pi4 but I assume it's really good, based on that. PS2 might even work well, there.
N64 was another story, but N64 emulators have improved a bunch since last time I tried. I think the only game I found that played acceptably on the Pi2 was Mario64. Most of the rest were slideshows or didn't run at all.
Well, number one, why would they? Apple makes money by getting consumers and locking them into their unicorns-and-rainbows ecosystem where everything is perfect, which makes consumers comfortable spending boatloads of money - not by selling commodity hardware.
Ecosystems with great UX and paid subscriptions plus a 30% cut on all transactions are far more profitable than the margins you make selling commodity hardware. Just ask famous phone manufacturers like Siemens, Nokia and Blackberry why that is. That's why SW dev salaries are much higher than HW dev salaries as the former generates way more revenue than the latter. That's why Apple doesn't roll out their own cloud datacenters and instead just gets Amazon, Microsoft and Google to compete against each other on pricing.
Apple only rolls out their solutions when they have an impact on the final UX, like designing their own M1 silicon.
And number two, selling chips comes with a lot of hassle like providing support to your partners like Intel and AMD do. Pretty sure they don't want to bother with that.
Before they start selling chips I would rather they open iMessage to other platforms to eliminate the bubble color discrimination.
> Before they start selling chips I would rather they open iMessage to other platforms to eliminate the bubble color discrimination.
Outside of the countries where iOS is on par with Android in terms of popularity (I think the US, Canada and the UK are the only ones, maybe also Australia), I don't know of a single person using iMessage. Of course there are a lot of people using iPhones outside of the mentioned countries, but absolutely nobody uses iMessage.
The whole bubble color discrimination seems to only happen in those countries where iOS is as popular as or more popular than Android and people actually use iMessage.
It's worse than that in the US. While iOS is a bit over 50%, it's closing in on 90% for teens[0], where such discrimination is most likely to occur. These numbers also bode well for Apple's future market share as these teens grow into adults.
In Russia, people stopped sending each other SMS before smartphones even became mainstream. At the time they were becoming mainstream, ICQ was the instant messaging service to use, and of course there was an unofficial ICQ client for just about anything that had a screen, a keyboard, and a network interface. Also VKontakte, but that was easily accessible via Opera Mini.
Right now 99.9% of those Russians who use the internet can be reached via either VKontakte or Telegram. WhatsApp is also popular, but thankfully not around me so I was able to delete my account and never look back.
Ditto. I'm the leper who prefers not to use WhatsApp, and I only get away with it since my partner takes up the slack, so to speak. Last weekend she and I were bemused by the inability of our 100% Apple hosts to: 1. Use AirPrint (worked from my Linux phone!!!) 2. Share a file with Linux or Android (an mp3 for a ringtone) 3. Install a ringtone. Breathtaking. I still have my classic. And yes, I have more proprietary Apple software on diskettes than most geeks I know. But Apple tanked long ago. And hardware is cheap. And FOSS is fun.
I already do, in Europe, where everyone and their mom uses Facebook's WhatsApp for everything. While that evens the playing field, I'm not sure I'd call trading a walled garden for a spyware one a massive victory though.
Apparently teens and even some adults in the US, where they'll miss out on social activities or be mocked or ignored for not being on iMessage.
That doesn't affect me though, as I don't live in the US and am too old for that kind of stuff, but I do remember how easy it was to be mocked or bullied as a teen for not having the same stuff as the herd, even before smartphones were a thing.
It's big in the startup world too; lots of funding happened on the iOS-exclusive "Clubhouse". Black people use Android more, so it is partially back to the old racially exclusive country club system.
I agree with you right up until the M1: how exactly does the M1 chip affect the final UX? A different keyboard, screen, touchpad, etc. all make a difference, but why does the chip make a difference?
Seems like Intel really lost the plot there, with every new generation having just a few percent better performance, trouble moving to smaller nodes, and the enormous regression from Spectre/Meltdown.
The Apple chips are made for running macOS/iOS. Seems there are some hardware instructions that are tailor made for increasing the performance of Apple software so they can make sure everything is working toward a common goal.
They are trying to compete, and have different levers to pull with varying success. When the performance per clock or per watt levers don't work well enough, then they increase the power, and the end result is heat and inefficiency.
On the flip side, integrated solutions add another lever... writing hardware that does exactly what your software needs to improve the user experience.
AMD, ARM, and even Intel have some cool, efficient solutions, but not across their whole portfolio of products, and not at the higher ends of performance. But they are always competing, incrementing and working to get closer to that ideal.
Apple was able to focus on their exact market segment and get there rapidly.
The end users don't care what brand of chip is under the hood, or why the UX on Apple's implementation of Intel chips sucked, they just know the new device has much better UX overall due to the more powerful and more efficient chip and will upgrade for that.
Not in the x86 arena. Every time Apple gets involved with a CPU developer (Motorola, IBM, Intel), their needs split from the developer's desires. This time they decided to go on their own (well, after years of doing this for the iPhone). Note: they have been involved in the ARM CPU market since the days of the Newton.
Many other manufacturers had made power-efficient ARM chips, however, the mainstream computer makers (just a few years ago including Apple) did choose x86 compatibility over power efficiency.
Just because you have money doesn't mean you have a market. Just running a plate to create test CPUs costs in the millions. All the others were happy with the incremental upgrades they were getting from ARM. Apple needed more and started creating CPUs for the iPhone a few years back.
Looks like I did misunderstand; I thought they actually meant the silicon technology itself, which is now available to the others, and they all have designs coming that use it.
Or alternately one where some Windows / Linux manufacturer could match Apple for all the innovations in the M1 Macbooks. I'm not an Apple fan but I'm envious of what they've accomplished and wish I could run Windows and Linux on similar hardware.
Other folks are starting to get there but only from the mobile device direction, e.g. Tensor. Maybe I should look closer at what Microsoft has done with ARM Surface.
It doesn't help that Apple bought the entire manufacturing capacity for 5nm silicon from TSMC right before the chip shortage hit. I think the next few years are going to get very competitive though, and I'm excited to see how Intel and AMD respond.
Apple has done that before. IIRC when the original iPod came out it used a new generation of HDD. Apple went to the drive manufacturer and said "we'll take all of them" and they agreed.
There's still 5nm silicon for sale, just not at TSMC (the largest semiconductor manufacturer in the world). Companies like Samsung are just now getting around to mass-producing 5nm, and afaik there were a few domestic Chinese manufacturers who claimed to be on the node too.
As for Amazon specifically though, I've got no idea. They're a large enough company that they could buy out an entire fab or foundry if they wanted, AWS makes more than enough money to cover the costs.
Yeah, I always need to think twice before writing or saying their name. Same with ASML. I guess there is a reason why TLAs are much more common than FLAs.
What are the innovations in them? From everything I've heard, they just basically reverted all the changes most people hated for the last few years and slapped a new chip in there.
The "walled garden" comes with a C and C++ toolchain, python, perl, awk, sed, and a Unix shell. It is not, in any way, a "walled garden" in a universe where words have shared meaning.
Exactly, I cannot believe the Hacker News crowd are penalising you for correcting OP on not knowing that the walled garden metaphor specifically refers to the App Store, which is not an issue on MacOS.
No. That might have been where you first saw the concept applied, but a walled garden is a commercial ecosystem that is deliberately closed to foster a sense of value and exclusivity, usually in spite of no technical reason for it.
Walled gardens are inherently anti consumer market plays that make things worse for everyone except the people milking money from the idiots paying into the walled garden.
What part of MacOS is a walled garden? I can use any Bluetooth or USB device with it. I can install Linux on it. I can compile my own code on it. I can download applications from any source I please and install them.
I'm hoping Alyssa Rosenzweig's fantastic work documenting the M1 GPU will let us write native Vulkan drivers even for MacOS. I believe she's been focusing thus far on the user space visible interfaces, so a lot of that work should translate well.
That's pretty common for TBDRs. The tile is rendered into a fixed-size on-chip buffer, and for nutty amounts of data coming out of the shader the driver has to split the tile into multiple passes to fit all of the render target data. PowerVR works the same way (completely unsurprisingly).
It'd be surprising if an architecture had 0 such surprises and did everything Vulkan allows without any special performance considerations vs another architecture.
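To make the "fit in the tile buffer" constraint concrete, here's a toy calculation; the tile size and per-pixel budget are assumed numbers for illustration, not actual M1 or PowerVR figures:

    // Toy illustration of why a TBDR driver may split a tile into multiple
    // passes. All numbers are assumptions for the example.
    #include <cstdio>

    int main() {
        const int tileWidth  = 32;          // assumed tile dimensions
        const int tileHeight = 32;
        const int onChipBytesPerPixel = 32; // assumed on-chip budget per pixel

        const int attachments = 8;              // e.g. 8 MRTs...
        const int bytesPerAttachmentPixel = 16; // ...at RGBA32F

        const int neededBytesPerPixel = attachments * bytesPerAttachmentPixel;
        const int passes = (neededBytesPerPixel + onChipBytesPerPixel - 1)
                           / onChipBytesPerPixel;

        std::printf("budget %d B/px, needed %d B/px -> %d passes per tile\n",
                    onChipBytesPerPixel, neededBytesPerPixel, passes);
        std::printf("on-chip memory per tile: %d bytes\n",
                    tileWidth * tileHeight * onChipBytesPerPixel);
        return 0;
    }

Once the per-pixel data outgrows the on-chip budget, the driver's options are spilling to memory or replaying the tile, which is exactly the kind of architecture-specific behavior being described here.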
It's fine, but it's frankly silly that you're forced to translate a free and open graphics API into a more proprietary one. Compare that to something like DXVK, which exists because Linux users cannot license DirectX on their systems. MoltenVK exists simply because Apple thought "let's not adopt the industry-wide standard for graphics on our newer machines". Again, not bad, but a bit of a sticky situation that is entirely predicated on technology politics, not on what's actually possible on these GPUs.
> still supports the OpenGL 1.1 ICD that they rely on?
On Windows 11, it’s OpenGL 3.3 on top of DX12, because Qualcomm doesn’t provide an OpenGL ICD at all.
> crappy drivers
Special mention to the Intel OpenGL graphics driver on Windows. If you thought that the AMD Windows one was bad, the Intel one was somehow significantly worse.
They already do; all middleware engines that actually matter support Metal.
Additionally, iOS and Apple have much better tooling for Metal than plain DirectXTK/PIX, or that toy SDK from Khronos (which Google also uses on Android), if we compare vendor tooling.
Sounds like you don't need any help then, enjoy your 50-75% performance hit playing (a scant few) games through DirectX -> Vulkan -> Wine/Crossover ($30) -> MoltenVK -> Metal!
The games I care about enjoy native Metal and DirectX, and when I code anything graphics I don't use Khronos stuff, only on the Web, where there is no other option.
Any of the game dev gems books have basic examples of doing an API loading layer.
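The basic shape is something like this (all names made up, just to show the pattern; a real engine adds device, swapchain and resource management behind the same kind of interface):

    // Minimal sketch of an API loading layer: game code talks to an abstract
    // renderer and a concrete backend is chosen once at startup.
    #include <cstdio>
    #include <memory>
    #include <string>

    struct IRenderer {
        virtual ~IRenderer() = default;
        virtual void drawFrame() = 0;
    };

    struct VulkanRenderer : IRenderer {
        void drawFrame() override { std::puts("drawing with Vulkan"); }
    };

    struct MetalRenderer : IRenderer {
        void drawFrame() override { std::puts("drawing with Metal"); }
    };

    // Chosen from platform detection or a config file.
    std::unique_ptr<IRenderer> createRenderer(const std::string& backend) {
        if (backend == "metal")
            return std::make_unique<MetalRenderer>();
        return std::make_unique<VulkanRenderer>();
    }

    int main() {
        auto renderer = createRenderer("metal");
        renderer->drawFrame(); // the game never touches API-specific types
        return 0;
    }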
Vulkan is mostly a Linux thing, and even the Switch has its own native API, NVN; it is neither Vulkan nor OpenGL in the driving seat.
Why? Apple has always stated they don't want to be in an enterprise-like market. It stifles innovation. While you can keep adding features to your product, you can never take away from it. Ex: x86 and Windows. Meanwhile, Apple has removed entire CPU functionality from their chips since the release of the iPhone 4S. This was easy because they only had to deal with their own developers. This keeps them agile and able to change from one release to another.
Broadcom chips are not available on the open market and they won't sell to you unless you are an enormous company (or have a "special relationship" as RPi did). Effectively you can only buy one attached to a Pi.
Why? They gave an "it's possible" proof. They reap the benefits of doing it first - all good. Now it's time for the competition to pick it up, possibly improve on it, or fade away Intel-style.
Except it's not a framework, it's just a library. You can use it to do any of the stuff you would do with CUDA, about as fast, but portably. #include it to accelerate your game's physics engine, or whatever.
It doesn't say so at kompute.cc, but I found that it depends on Vulkan 1.1.
What’s the test software / benchmark I should use on Linux nowadays to measure (and compare) shader and raw GPU performance? That would ideally run under both X and Wayland?
I have always tended towards the Phoronix Test Suite (https://www.phoronix-test-suite.com/) but I'm sure there are a few benchmarks specific to Vulkan around. Not sure about Wayland.
Problem with this is that the application I have in mind doesn't provide anything but perceptual feedback. I'd rather have some cold numbers that are to some degree reproducible and would give at least a rough idea of the performance of a given HW+drivers+other-settings combination.
You can hook up a desktop graphics card to the Raspberry Pi 4 Compute Module. It's got a single lane of PCIe Gen 2.
It's very unlikely you can get drivers working with it, though.
Today on the Jetson Nanos, you can just use the stock Fedora image (flashed to a microSD card).
It’s much better than what it was before. nouveau works ootb, including reclocking too.
It’s also to be noted that all Tegras have an open-source kernel mode GPU driver (nvgpu) even when using the proprietary stack. However, that driver isn’t in an ideal state today.
921.6MHz is the GPU clock on Jetson Nano (at MAXN).
For the Switch:
> The GPU cores are clocked at 768 MHz when the device is docked, and in handheld mode, fluctuating between the following speeds: 307.2 MHz, 384 MHz, and 460 MHz
> Now combine this with Zink and boom! We get OpenGL 4.6 for free
For the RPi4 specifically:
That GPU has hardware limitations that make it incapable of OpenGL 3.0. However, it supports GLES 3.2.
If you want desktop GL minus the features unsupported by the hardware, you can set MESA_GL_VERSION_OVERRIDE=3.3, for example. That will, however, never be compliant.
Vulkan has many optional features and extensions that allow it to work on hardware which doesn't support the full feature set (by simply not implementing them, instead of relying only on version numbers).
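For the curious, here's roughly how any Vulkan client (a layered GL driver included, though this isn't Zink's own code) can ask the implementation what the hardware actually supports instead of assuming a version; only core Vulkan 1.0 entry points are used:

    // Generic capability query against a Vulkan implementation.
    // Error handling kept to the bare minimum.
    #include <vulkan/vulkan.h>
    #include <cstdio>
    #include <vector>

    int main() {
        VkApplicationInfo app{};
        app.sType = VK_STRUCTURE_TYPE_APPLICATION_INFO;
        app.apiVersion = VK_API_VERSION_1_0;

        VkInstanceCreateInfo ici{};
        ici.sType = VK_STRUCTURE_TYPE_INSTANCE_CREATE_INFO;
        ici.pApplicationInfo = &app;

        VkInstance instance;
        if (vkCreateInstance(&ici, nullptr, &instance) != VK_SUCCESS)
            return 1;

        uint32_t count = 0;
        vkEnumeratePhysicalDevices(instance, &count, nullptr);
        std::vector<VkPhysicalDevice> devices(count);
        vkEnumeratePhysicalDevices(instance, &count, devices.data());

        for (VkPhysicalDevice dev : devices) {
            VkPhysicalDeviceProperties props;
            VkPhysicalDeviceFeatures features;
            vkGetPhysicalDeviceProperties(dev, &props);
            vkGetPhysicalDeviceFeatures(dev, &features);

            std::printf("%s: maxColorAttachments=%u independentBlend=%s\n",
                        props.deviceName,
                        props.limits.maxColorAttachments,
                        features.independentBlend ? "yes" : "no");

            // Optional functionality beyond the core version is advertised
            // as device extensions.
            uint32_t extCount = 0;
            vkEnumerateDeviceExtensionProperties(dev, nullptr, &extCount, nullptr);
            std::printf("  %u device extensions reported\n", extCount);
        }

        vkDestroyInstance(instance, nullptr);
        return 0;
    }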
The Pi hardware may not support multiple render targets or other features directly in hardware, but Zink is not required to (and does not always) emit one Vulkan API call for each OpenGL API call. It is free to issue as many as are needed to properly emulate the OpenGL API in a conformant way. That being said, I don't think this particular compatibility is in Zink today, but there is nothing preventing it from being possible just because the hardware couldn't create the render targets all in one shot.
There is no hard technical requirement for hardware drivers but it's riskier to expose performance impacting emulation at that level vs the layered driver level (where Zink is). For instance imagine a case where the hardware supported 4 MRTs but the hardware driver emulation layer exposed 8 MRTs for OpenGL compatibility yet Zink needed to use 16 MRTs. Now you've got all sorts of translation happening where Zink is likely calling the lower emulation layer multiple times rather than just calling the hardware directly. Such emulation layers are expected in a layered driver, that's part of their actual intent, whereas base hardware drivers are meant to expose what the hardware is able to do natively and let you work around it otherwise.
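A conceptual sketch of that splitting (not Zink's real implementation, just the chunking idea with made-up types):

    // Honour a GL draw that uses more color attachments than the device
    // limit by splitting the attachment list into chunks and replaying the
    // draw once per chunk.
    #include <algorithm>
    #include <cstdio>
    #include <vector>

    struct Attachment { int id; };

    // Stand-in for "record one Vulkan render pass + draw over these targets".
    void drawChunk(const std::vector<Attachment>& chunk) {
        std::printf("pass over %zu attachment(s)\n", chunk.size());
    }

    void emulatedMultiDraw(const std::vector<Attachment>& glAttachments,
                           std::size_t maxColorAttachments) {
        for (std::size_t i = 0; i < glAttachments.size(); i += maxColorAttachments) {
            std::size_t end = std::min(glAttachments.size(), i + maxColorAttachments);
            std::vector<Attachment> chunk(glAttachments.begin() + i,
                                          glAttachments.begin() + end);
            drawChunk(chunk);
        }
    }

    int main() {
        std::vector<Attachment> mrts(8); // the app asked for 8 MRTs
        emulatedMultiDraw(mrts, 4);      // hardware only exposes 4
        return 0;
    }

The real driver obviously does this with actual render passes and resource copies, but the principle is the same: more Vulkan work per GL call whenever the hardware limit is lower than what the GL version promises.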
You can already enjoy stuff like OpenGL 2.1 support on purely GLES 2.0 hardware this way - for instance on older Raspberry Pis. There's not much Zink will bring to the table that Gallium doesn't already when it comes to emulation of missing hardware features (at least not if you want them to actually perform in any reasonable way).
Ideally, shouldn't Zink query the Vulkan driver for what capabilities the HW has, and then expose an appropriate OpenGL version?
Unconditionally exposing the latest GL version by emulating missing GPU functionality sounds like a recipe for applications to fall off performance cliffs.
When you say render targets, do you mean drm buffers? Or on GPU output buffers?
I'm not quite completely clueless, but I have the feeling that clarification on this point will nudge me in the right direction to understanding these things better.
I've always wondered how this would work. Surely if it were possible to reasonably implement OpenGL 4.6 on the Pi GPU, it would already have been done through Mesa.
What for? I'm sure you already have some older computers in your house that run much better. An APU would trounce it. Pis feel like the netbooks of desktop computers; the new ones get extremely hot, and I would expect it to require a heavy heatsink and a constantly spinning fan if you tried this.
The Steam Deck is a generic PC. It's not locked to the Steam store. It's not even locked to the OS it comes with. You can install the Epic store, or any other store, on it right now. If you have one, anyway.
You can download EGS games on Linux just fine, so ostensibly you could build one of these right now. Of course, you probably wouldn't want to use ARM for a PC game console, but you're welcome to try it.