Friday, December 24, 2010

Mystery (probably) solved; how Samsung pulls off its GPU magic.

EDIT - Hmm. One of my readers (who has been an enormous resource in the past) has posted below in the comments why I am likely incorrect in my theory. Be sure to check out the comments at the end of the article!

So, despite my efforts to pull myself away from ARM architecture, Android, and specifically, the mysteries surrounding the Hummingbird processor, I can never really extract myself. One of these days I'll get around to obsessing over something else (hopefully career-related) but until then, I'll let you know what I think I've uncovered as the solution to how Samsung solved the GPU bandwidth issue (which I puzzled over in my original Hummingbird vs. Snapdragon article.)

There have been a few opportunities where I've had to step in and correct people when they post that a Galaxy S phone has only ~320 megs of RAM. It's an error I see made frequently when people use Android system info applications that can only see the 320 megs of volatile memory, despite the fact that the phone does actually contain 512 megs of RAM. We see it happen every time a new Galaxy S phone is leaked, even the Nexus S.

The explanation for this has always been that a certain amount of memory has been "reserved" by Samsung for the Android OS, and that memory is neither visible nor available to applications. Despite this, I've never been able to figure out exactly how the system provides the 12.6 GB/sec of memory bandwidth it (theoretically) needs to push out 90 million triangles/sec with the PowerVR SGX540 GPU.

I'm not quite sure how it happened, but in my meanderings across the interwebs, I ran across the following image on odroid.com, of the block diagram of the S5PC110 that they use for their developer board.



Careful observation of the POP (Package-On-Package, or "Stacked" circuits) module on the left-hand side shows 384 MB of LPDDR1 and 128MB of OneDRAM, a term I'd noticed on S5PC110 documentation on the list of supported technologies. I'd assumed that it wasn't used. I'd already determined that even though the Hummingbird supports LPDDR2, it only supports it at 400 Mbps transfer rate (which LPDDR1 is capable of) and, with an x32 bus, only allows for 1.6 GBps data bandwidth, a far cry from the 12.6 GBps needed.
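For anyone who wants to check the 1.6 GBps figure, here's a minimal Python sketch of the arithmetic. It assumes the usual peak-bandwidth formula (per-pin transfer rate times bus width, divided by 8 bits per byte) and decimal units; the function name is just something I made up for illustration.

```python
# Peak theoretical bandwidth of a memory interface:
# (per-pin transfer rate) x (bus width in bits) / 8 bits per byte.
# Assumes decimal units (1 GB = 1000 MB) and ignores real-world overhead.

def peak_bandwidth_gbps(mbps_per_pin: float, bus_width_bits: int) -> float:
    """Return peak bandwidth in gigabytes per second."""
    megabytes_per_sec = mbps_per_pin * bus_width_bits / 8
    return megabytes_per_sec / 1000

# 400 Mbps per pin on an x32 bus, the figures quoted above.
lpddr_bandwidth = peak_bandwidth_gbps(400, 32)
print(f"{lpddr_bandwidth:.1f} GB/s")
```

This is a best-case ceiling; sustained bandwidth on real hardware would be lower.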

So what is this OneDRAM? According to Samsung, "OneDRAM is a fusion memory chip that, can significantly increase the data processing speed between a communications processor and a media processor in mobile devices," and, "...this results resulting in a five-fold increase in the speed of cellular phone and gaming console operations, longer battery life and slimmer handset designs." (Sic.)

Hear hear! 5 times 1.6 GBps is only 8 GBps, which still doesn't equal 12.6, but the 12.6 number is something I arrived at using a lot of assumptions (4.2 GBps of bandwidth needed by the PowerVR SGX540 to perform 28 million triangles per second, multiplied by 3 to make ~90 million triangles per second). I'm satisfied that the OneDRAM is that holy grail memory I've been looking for.
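To lay the numbers side by side, here's a rough Python sketch. It leans on two assumptions of mine: that required bandwidth scales linearly with triangle rate, and that Samsung's "five-fold" claim can be applied directly to the 1.6 GBps LPDDR figure. Neither is confirmed by any documentation, and the variable names are purely illustrative.

```python
# Assumed inputs from the post (not measured values).
BASE_BANDWIDTH_GBPS = 4.2      # assumed need at 28 million triangles/sec
BASE_TRIANGLES_M = 28
TARGET_TRIANGLES_M = 90
LPDDR_BANDWIDTH_GBPS = 1.6     # 400 Mbps x 32-bit bus
ONEDRAM_CLAIMED_SPEEDUP = 5    # Samsung's "five-fold" marketing figure

# The post rounds 90/28 down to 3; strict linear scaling gives a bit more.
rounded_estimate = BASE_BANDWIDTH_GBPS * 3
linear_estimate = BASE_BANDWIDTH_GBPS * TARGET_TRIANGLES_M / BASE_TRIANGLES_M

# Best case if the five-fold claim applied directly to bandwidth.
best_case = LPDDR_BANDWIDTH_GBPS * ONEDRAM_CLAIMED_SPEEDUP
shortfall = rounded_estimate - best_case

print(f"rounded estimate: {rounded_estimate:.1f} GB/s")
print(f"linear estimate:  {linear_estimate:.1f} GB/s")
print(f"5x LPDDR:         {best_case:.1f} GB/s")
print(f"shortfall:        {shortfall:.1f} GB/s")
```

Even under these generous assumptions there is a gap of several GBps, which is why the 12.6 figure should be read as a loose upper bound rather than a hard requirement.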

Now, how to prove that it actually exists inside my Epic 4G? Remember, the S5PC110 Hummingbird doesn't come with memory built-in; that's something that gets stacked on when the phone is built. The ODROID guys could very well be using a completely different configuration; though that ~320 megs showing up over and over in Android system info apps hardly seems like a coincidence, assuming the difference between 384 and 320 is actually reserved memory for the OS' own system applications. The OneDRAM on the other hand would be reserved primarily for hardware use, such as the GPU as Samsung earlier suggested.
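The memory accounting behind that hunch is simple enough to write down. This little Python sketch just restates the numbers above; the idea that the missing chunk of LPDDR1 is reserved for the OS is my assumption, not something the block diagram states.

```python
# Figures from the ODROID block diagram and Android system info apps.
LPDDR1_MB = 384
ONEDRAM_MB = 128
VISIBLE_TO_APPS_MB = 320   # roughly what system info apps report

total_mb = LPDDR1_MB + ONEDRAM_MB
# Assumption: whatever LPDDR1 the apps can't see is held back by the OS.
assumed_os_reserved_mb = LPDDR1_MB - VISIBLE_TO_APPS_MB

print(f"total RAM: {total_mb} MB")
print(f"assumed OS-reserved LPDDR1: {assumed_os_reserved_mb} MB")
```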

I turned to one of my Android developer acquaintances, noobnl of xda-developers.com. Noobnl is well-known within the dev community, particularly the Epic 4G branch. He's built the universal one-click root for the Epic 4G, the ClockWork Recovery installer, built the original Epic 4G Andromeda ROM that fathered many others, and is currently working on porting Cyanogenmod 6 over to the Epic 4G (and is quite close!). Anyhow, when I showed him what I'd run across (hoping to see if he'd heard of this before, as he has a good handle on Epic hardware), he told me that I had made a good find. He also pasted some kernel code that clearly referenced OneDRAM, proving that the Epic 4G contains this technology.

So there you go, folks. The secret is out. The Galaxy S phones are likely able to achieve such amazing graphics performance via a 128 MB Samsung-proprietary high-speed hybrid memory solution. The remaining 384 megs of memory is plain-jane LPDDR1. The total is the promised 512 megs, and honestly, I wouldn't trade the OneDRAM for 128 megs more of LPDDR1 available as application memory, but it's interesting how Samsung has kept the OneDRAM solution so quiet. Samsung is likely enjoying the current GPU supremacy of the Galaxy S phones; unfortunately, once Cortex-A9, LPDDR2 memory (> 400 Mbps), and dual-channel memory controllers arrive, they will be back on a level playing field. Who could blame them for setting aside Orion and picking up NVIDIA's Tegra 2 SoCs for their next-gen smartphones? It's a fast-moving industry out there, particularly when you don't have Intrinsity any longer as your ace-in-the-hole. Curse you, Apple.

2 comments:

  1. Nice theory but it makes no sense and is almost certainly wrong. OneDRAM (as explained in the Samsung page you link to) is a dual-ported RAM to replace a dual-ported SRAM used for communication between two processor packages (i.e. between the modem/baseband processor and the application processor) and some or all of the SDR or DDR DRAM required for the baseband and AP. Nothing on the linked page implies or even suggests that OneDRAM is faster than LPDDR1, and in fact might even be slower. You can clearly see that OneDRAM has a 16 or 32 bit SDR or DDR1 interface on either side that is pin-compatible with commodity LPDDR1. So no, the 128 MB of OneDRAM is not for magic, it is shared by the baseband and the AP and used for the baseband->AP communication. Myth busted (sorry!).

    I personally suspect the Hummingbird gets such vastly higher GPU benchmark results through more/better/faster on chip GPU cache/tile memory/GPU working memory or whatever. Remember these tile-based GPUs use memory in an entirely different pattern than desktop GPUs and are optimized for minimal off-chip bandwidth by working in small tiles; small enough that you can keep everything in the on-chip (fast) RAM.

    (My comments are speculation of course, just like yours.)

  2. Thanks Hugo, I appreciate your critique! You always seem to turn up when I'm off the mark and I welcome your analysis.

    In any case, I guess I misunderstood the purpose of the OneDRAM module from what I read in Samsung's documentation. I suppose that mention of improving gaming performance gave me tunnel-vision.

    I would like to see some information made public about the PowerVR graphics chips themselves... the improved onboard graphics memory would make sense, but I was under the impression that the PowerVR SGX540 used a unified memory architecture. If it is able to use its own dedicated memory, that would completely change the tables.
