
The argument simply does not track with reality, or with my understanding of GPUs in general. The vague advice I've heard for GPUs on mobile systems is to learn about the peculiarities of tile-based rendering and to optimize for it specifically, because that's the main architectural difference between desktop GPUs and phone GPUs. But nothing discussed in this set of twitter posts matches what I've often heard about iPhone/Android GPU programming. I've seen experts give advice on Apple/Android phone architecture and tile-based rendering, and I think I can recognize an expert when I see one talking. This set of twitter posts is not an expert talking.

Some of the main assertions about how common GPUs work are totally backwards:

"when a GPU is waiting for data, it can't just switch to work on something else" - GPU designs specifically rotate between threads precisely so that they do not block on memory access.

"the concept that there is a local on-die memory pool you can read/write from with very very low perf impact is unthinkable in the current desktop GPU space" - Optimizing for the characteristics of a specific device's memory hierarchy, including local/shared memory, is a big part of GPU coding, on desktop too.

If Apple follows the pattern that companies like NVIDIA have used, they will keep the overall shape of the GPU similar between generations (in terms of how many threads, memory hierarchy, etc.). This will give app devs relatively similar and stable tuning targets; we'll see high-dollar apps aggressively optimized for specific chips, and as that happens, power/heat will go up.
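That thread-rotation point can be shown with a toy scheduler model. This is a pure-Python sketch with invented numbers (32 warps, 400-cycle memory latency), not a model of any real GPU: it compares a core that stalls on every load against one that switches to another ready warp instead of waiting.

```python
# Toy model (not real hardware): why rotating between warps hides
# memory latency, versus a core that blocks on every load.

def blocking_cycles(warps, loads, latency):
    # One warp at a time; every load stalls the whole core.
    # 32 warps x 100 loads x 400 cycles = 1,280,000
    return warps * loads * latency

def rotating_cycles(warps, loads, latency):
    # Each cycle, issue one load from any warp whose previous load
    # has completed; switch to another warp instead of stalling.
    ready_at = [0] * warps        # cycle when each warp may issue again
    remaining = [loads] * warps   # loads left per warp
    cycle = 0
    while any(remaining):
        candidates = [w for w in range(warps)
                      if remaining[w] and ready_at[w] <= cycle]
        if candidates:
            w = min(candidates, key=lambda c: ready_at[c])
            remaining[w] -= 1
            ready_at[w] = cycle + latency
        cycle += 1
    return max(ready_at)

print(blocking_cycles(32, 100, 400))   # 1,280,000 cycles
print(rotating_cycles(32, 100, 400))   # dramatically fewer cycles
```

With even this modest warp count, the rotating scheduler finishes roughly 32x sooner, because loads from different warps overlap in flight; real hardware carries the same idea much further.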

But I can almost assure you, even in my ignorance, that this has nothing to do with TLBs at all.

The Twitter posts are talking about temperature at first, then tile-memory. I admit that I'm ignorant of tile-memory, but the traditional answer to TLB bottlenecks is to enable large-page or huge-page support, so I'm already feeling a bit weird that large pages / huge pages haven't been discussed in the twitter thread. (Maybe it's not the answer on the GPU side of things? But at least acknowledging how we'd solve it in CPU-land would help show off how features like huge pages could help.)

Like, none of the logic here makes sense to me at all. Maybe something weird is going on in the Apple / M1 world, but I have a suspicious feeling that the person posting this Twitter thread is at best misstating things in a very confusing manner and using imprecise language. At worst, they might be fundamentally incorrect on some of these discussion points.

> I don't code for GPUs but it seemed pretty straightforward to me:

1. I don't know all the words that are being used in this tweet. But the words I _DO_ understand are being used incorrectly, which is a major red-flag.

2. The temperature issues are a non-sequitur, but that's where the discussion starts. There are a whole slew of power-configuration items on modern chips, none of them discussed in this set of twitter posts.

3. I understand that Apple has an iGPU that shares the memory controller, but GPU workloads are quite often linear and regular. It is difficult for me to imagine any step in the shader pipeline (vertex shader, geometry shader, pixel / fragment shader) that would go through memory so randomly as to mess up the TLB. If you're saying the TLB is the problem, you need to come forth with a plausible use-case for why your code is hopping around RAM so much that it causes page-walks.

4. I realize that mobile GPU programmers are always talking about this "tile-based rendering" business (popular on Apple / Android phones), and I've never bothered to learn the details of it. In fact, everything they say about GPUs in the thread is blatantly wrong on its face, or possibly some kind of misstatement about the peculiarities of the M1 Apple iGPU.
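The traditional large-page answer mentioned above works by multiplying TLB "reach": the amount of memory the TLB can translate without a page-walk is just entries times page size. A back-of-envelope sketch (the 1536-entry figure is an invented example, not a measured M1 number):

```python
# TLB "reach" = memory covered without a page-walk.
# Entry count below is a made-up example, not a real chip's figure.
KB, MB = 1024, 1024 * 1024

def tlb_reach(entries, page_size):
    return entries * page_size

base = tlb_reach(1536, 4 * KB)      # 1536 x 4kB pages  = 6 MB of reach
huge = tlb_reach(1536, 2 * MB)      # same entries with x86-style 2MB
                                    # huge pages = 3 GB of reach
print(base // MB, huge // MB)
```

Same TLB hardware, 512x the coverage: that's why huge pages are the first thing people reach for when page-walks show up in a profile, and why their absence from the thread is conspicuous.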

This sequence of posts is written very strangely and imprecisely. A precise claim would come with numbers: you'd say, maybe, 8192 entries of TLB, each with 4kB pages, for instance. (Others here on Hacker News suggest that the M1 uses a 16kB page, so maybe 2048 entries of 16kB each?)
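For what it's worth, those two guessed geometries describe the same total reach; both sets of numbers above are speculation from the comment, not documented M1 figures:

```python
# The two hypothetical TLB shapes cover identical address space.
KB = 1024
reach_4k  = 8192 * (4 * KB)     # 8192 entries of 4kB pages
reach_16k = 2048 * (16 * KB)    # 2048 entries of 16kB pages
print(reach_4k, reach_16k)      # both 33554432 bytes, i.e. 32 MB
```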
