Wii GPU is similar to ATI Rv610

Posted by tapionvslink 
Wii GPU is similar to ATI Rv610
March 12, 2010 09:17PM
The Wii's Hollywood GPU is capable of GPGPU natively and is comparable to an ATI RV610, since it can do HDR+AA in games like Monster Hunter 3, Resident Evil: The Darkside Chronicles, Silent Hill, Cursed Mountain, etc. HDR+AA was not possible even on the Nvidia 7 series, since HDR required a lot of power and coupling it with AA (anti-aliasing) was out of the question until the new architecture that came with the ATI X1000 family showed it was possible, even though those cards had only 16 pipelines against the 24 pipelines of the Nvidia 7 series. The ATI X1000 cards were also the first GPUs capable of GPGPU natively.


ATI Radeon X1000 and Nvidia 8800: the first GPUs capable of GPGPU natively.-
According to this report, GPGPU is not feasible on ATI Radeons below the X1000 series (R520 and R580):

[news.softpedia.com]

"
Among the GPGPU capable ATI and Nvidia chips there are those of the latter generation like the GeForce 8800 and

Radeon X1000 and HD 2000s, but older chips can too be used but as they are slower don't hold your breath waiting for

results. Unlike the processors that are capable only of a limited degree of parallelism, being able to execute a few

SIMD instructions per clock cycle, the latest graphics chips can process hundreds. This feature is hardware

implemented in all newer video chips and it is related to the fact that such a chip is made up by many identical

processing units like the pixel shading units. For example, ATI's Radeon X1900 chip has 48 such units, while the

GeForce 8800 numbers 128, so their parallel processing capabilities are very high.
"



ATI Radeon X1000 family: HDR + anti-aliasing

[www.xbitlabs.com]

"

HDR: Speed AND Quality
The new generation of ATI’s graphics processors fully supports high dynamic range display modes, known under the

common name HDR.

One HDR mode was already available in RADEON X800 family processors, but game developers didn’t appreciate that

feature much. We also described HDR in detail in our review of the NV40 processor which supported the OpenEXR

standard with 16-bit floating point color representation developed by Industrial Light & Magic (for details see our

article called NVIDIA GeForce 6800 Ultra and GeForce 6800: NV40 Enters the Scene ).

OpenEXR was chosen as a standard widely employed in the cinema industry to create special effects in movies, but PC

game developers remained rather indifferent. The 3D shooter Far Cry long remained the only game to support OpenEXR

and even this game suffered a tremendous performance hit in the HDR mode. Resolutions above 1024x768 were absolutely

unplayable. Moreover, the specifics of the implementation of HDR in NVIDIA’s graphics architecture made it

impossible to use full-screen antialiasing in this mode (on the other hand, FSAA would just result in an even bigger

performance hit). The later released GeForce 7800 GTX, however, had enough speed to allow using OpenEXR with some

comfort, but it still didn’t permit to combine it with FSAA.

ATI Technologies took its previous experience into account when developing the new architecture and the RADEON X1000

acquired widest HDR-related capabilities, with various – and even custom – formats. The RADEON X1000 GPUs also

allows you to use HDR along with full-screen antialiasing. This is of course a big step forward since the NVIDIA

GeForce 6/7, but do the new GPUs have enough performance to ensure a comfortable speed in the new HDR modes? We’ll

only know this after we test them, but at least we know now why the R520 chip, the senior model in ATI’s new GPU

series, came out more complex than the NVIDIA G70. The above-described architectural innovations each required its

own portion of transistors in the die. As a result, the R520 consists of 320 million transistors – the most complex

graphics processor today! – although it has 16 pixel pipelines against the G70’s 24.

"

ATI R520 (the GPU in the ATI Radeon X1000 series): the first GPU that supported HDR+AA
[en.wikipedia.org]

"
ATI's DirectX 9.0c series of graphics cards, with complete Shader Model 3.0 support. Launched in October 2005, this

series brought a number of enhancements including the floating point render target technology necessary for HDR

rendering with anti-aliasing. Cards released include X1300 - X1950.
"

The ATI Hollywood was finished sometime after June 2006, which is when NEC announced that it would provide the eDRAM for the Wii in collaboration with MoSys.



Monster Hunter 3 on Wii shows us HDR+AA
[news.vgchartz.com]

"

Monster Hunter 3 demo impressions
By Nick Roussounelos 14th Jun 2009
Nintendo, 3,525 views

We tried out the Monster Hunter 3 demo bundled with the Wii version of Monster Hunter G. Quest Complete!

Monster Hunter 3 (~Tri) Preview

Upon the series debut on PS2 back in 2004 the public reaction to the Monster Hunter series was, at best, mediocre.

Someone, who is unquestionably very happy now, proposed porting Monster Hunter to Sony's portable wonder and here we

are, with MH2G climbining to 3 million and counting! Now, let's take a sneak peek at how things fare for the next

iteration of the franchise, Monster Hunter 3, developed from the ground up on Wii, through the free demo bundled

with the system's own version of Monster Hunter G which was recently released. Before we start, let me make it clear

that I don't speak Japanese, so don't ask for any story coverage.

The demo consists of two quests, hunt either a Kurubekko (little wyvern much like Kut-Ku) or an alpha carnivore,

Dosjagi. You can choose between 5 weapons: Great Sword, Sword & Shield, Hammer, Light Crossbow and Heavy Crossbow.

Make your selections and off you go! The first thing you'll notice is the spectacular graphics; no, that's no

hyperbole. The lighting, the texture detail, anti-aliasing (yes, you read that right, no jaggies), HDR (High dynamic

range rendering for us geeks) and other technologies make this one of the prettiest Wii games on the market, period.


PSP Monster Hunter games looked phenomenal on the platform and pushed the hardware beyond what was thought possible,

Capcom made no compromises this time either. The game also features a great musical score, which is much more epic

in scope than previous iterations. Loading times between each areas take 1-2 seconds tops, nothing disturbing, but

not very pleasant either.


"

There are also other Wii games that have shown HDR+AA, like Cursed Mountain, Resident Evil: The Darkside Chronicles, and Silent Hill, and it seems Xenoblade will too. By the way, you should also look at games like Zangeki no Reginleiv; watch the videos on gametrailers.com. There is also something you should know: HDR cannot be done without programmable shaders.


I have also read that HDR is not possible without programmable shaders.-
[courses.csusm.edu]

"
OpenGL fixed - functions cannot be used for HDR.

The programmable pipeline must be used to achieve HDR lighting effects.

HDR is achieved in OpenGL using a fragment shader, written in GLSL or Cg.

"

By the way, Zelda: Twilight Princess did not use HDR; that was simulated HDR. Just compare Zelda to Monster Hunter 3 images and you will see the difference without difficulty.


I am sure that most developers who say that the Wii has no programmable shaders say so because they know the DirectX shader models but not the OpenGL Shading Language; Nintendo uses an API called GX which is very similar to OpenGL, and there have also been successful attempts at using wrappers like a Mesa (OpenGL) driver.

[www.phoronix.com]
"
A Mesa (OpenGL) Driver For The Nintendo Wii?
Posted by Michael Larabel on January 28, 2009
There is now talk on the Mesa 3D development list about the possibility of having a Mesa driver for the Nintendo

Wii. Those working on developing custom games for this console platform have already experienced some success in

bringing OpenGL to the Wii through the use of Mesa.

Nintendo has its own graphics API (GX) for the Wii, which is resemblant of OpenGL but still different enough that

some work is required to get OpenGL running. A way to handle this though is by having a Mesa driver translate the

OpenGL calls into the Wii's GX API. This is similar to gl2gx, which is an open-source project that serves as a

software wrapper for the Nintendo Wii and GameCube.

Their first OpenGL instructions running on the Nintendo Wii is shown below (it's not much, but at least it's better

than seeing glxgears

[www.phoronix.net]

"

Or perhaps it could be because they are using the first SDKs, which came out even before Hollywood's design was finished.


There are many engines that support programmable shaders for Wii, like:

Unity for Wii.- [unity3d.com]
"
Scriptable Shaders
Unity's ShaderLab system has been expanded to unlock the full power of The Wii console's graphics chip. Use one of

the built-in Wii-optimized shaders or write your own. Script and modify at runtime any shader on any objects in any

way you like.

"

Gamebryo LightSpeed for Wii.-

[www.emergent.net]

Gamebryo 2.6 Features:

Full tools capabilities for Nintendo Wii:

Wii Terrain: Emergent Terrain System is now part of Wii, with base terrain/shader support, port of terrain samples and detail maps in shader.
Display List from tools reduces memory requirements
Improved resource utilization via VAT
Copy / Fast Copy
Mesh profiles
Floodgate – improves performance and reduces risk when porting from multi-core platforms.
Reduces MEM1 memory requirements
Saves memory with exports that use smaller data types
"

Athena engine for Wii (used in Cursed Mountain).- [wii.ign.com]
"
To deliver on these goals, the Sproing team relies on its proprietary "Athena" game engine, which is rendering the Himalayas on Wii at a quality never seen before. The engine highlights of "Athena" include amongst others, HDR-Rendering, shader simulations developed especially for Wii in order to display ice, heat and water (realistic reflections and refractions), an ultra-fast particle system for amazing snow storms, soft particles for realistic fog and smoke, depth of field, motion blur, dynamic soft shadows, spherical harmonics lighting, as well as a high performance level-of-detail and streaming system in order to provide long viewing distance of the entire surrounding. In order to create an exciting atmosphere when battling the ghosts, the game employs a number of custom-created special effects such as the shader simulations as well as a newly developed post-processing framework. "Our engine technology really takes the Wii hardware to its limits and Wii gamers can really look forward to a heart-stopping, and breath-taking world that comes alive with this title," said Gerhard Seiler, Technical Director of Sproing.
"


Shark 3D for Wii™ (read about the Re-entrant Modular Shader System™).-
[www.spinor.com]
"
4. Empower your game designer: Live editing
One of Shark 3D’s core features is the tool pipeline, which supports live editing from the bottom up.
Edit your game items and NPCs in the Shark 3D Wii game editor,
edit your game scripts,
edit your textures in Photoshop,
edit your animations and geometry in Max or Maya,
edit your rendering effects in the Shark 3D Wii shader editor, ...
... and see and test all your changes live not only on your PC, but also on your Wii development kit!
Furthermore, supported by Shark 3D's live-live™ editing, developers can easily modify and customize the tool pipeline for their particular needs. This makes game production for Wii more efficient and predictable, significantly reducing production times and risks.


5. Empower your artists: Lighting, shadowing, reflections on Wii - easier than ever!
The node based shader editor empowers your designers to create their own Wii shader graphs.
Your designers can easily create, combine and customize lighting, shadowing, reflections, shader animations, color operations, and many other effects.




Internally, the shader editor automatically compiles the Wii shader graphs into assemblings of Shark 3D renderer modules and pipeline stages, powered by the Re-entrant Modular Shader System™ (http://www.spinor.com/?select=tech/render#modular_shader_system). Of course, the designers can edit the shaders interactively both on the Wii and on PC. This is especially useful when fine tuning the look on the Wii.



8. Many useful run-time features included
Run-time features of Shark 3D for Wii include the following:


Rendering
Efficient rendering: Off-line generation and Wii specific optimization of Wii display lists and Wii vertex data.
Efficient skinning: Off-line assembling of skinned meshes specifically for the Wii. At run-time, the CPU does not touch any vertex, index or polygon data anymore.
Wii specific off-line conversion of textures for optimal performance and minimal memory usage.
A multi-layer animation system.
A character walking animation system.
Efficient culling of static and movable objects for rendering.
A shader system providing access to Wii specific render state settings.
A particle system.
Dynamic local lights.
Lighting management respecting Wii specific aspects.
GUI rendering.
TrueType Unicode text rendering.
Smart management of object state dependencies using delayed evaluation.
Smart attachment point management.
Multiple viewports with separate culling for multiplayer games.
And many other features.


Sound
Optimized Wii specific sound formats.
Run-time support for compressed Ogg Vorbis files.
Efficient culling of static and movable sound sources placed in the level.
Streaming sound (e.g. background music) from disk.
Wii specific off-line optimization and conversion of sounds for optimal performance and minimal memory usage.
Various other sound engine features.


Physics
Collision detection.
Physics.
Automatically generated sound for physics collisions and friction.
Automatic merging of nearby contact sounds.
Various constraints.
Generic path-based motors.
Various other physics features.


Game engine
Triggers and sensors.
Scripting.
Level management.
Many other game engine features.


Streaming
Background loading of levels without halting the rendering loop.
Background loading of levels parts and asset groups without halting the rendering loop.


Resource and memory management
Pack files.
Optimized low-level memory management.
Optimized for avoiding implicit memory allocations while a level is running. This avoids unexpected out-of-memory errors while a level is running.
Many other features.

"



Another proof that ATI Hollywood was designed for GPGPU is Havok FX

Wii Havok FX
[www.gamasutra.com]

"
Product: Havok Supports Wii, Next-Gen At E3
by Jason Dobson
May 15, 2006

During last week's E3 event in Los Angeles, cross-platform middleware physics solution provider Havok announced support for the Wii, Nintendo's upcoming next-generation video game console platform.

The GameCube's current library of software titles feature more than fifteen games that utilize Havok middleware, and this announcement confirms the continued support of Nintendo by Havok.

"Havok has become synonymous with state-of-the-art physics in games in recent years," said Ramin Ravanpey, Director of Software Development Support, Nintendo of America. "With this announcement from Havok, we feel Wii developers have another critical tool in their hands that helps unleash the real magic of the Wii platform."

In addition, Havok's software solutions were featured in 35 titles from 25 different developers at E3 across multiple platforms, including Alan Wake, Alone in the Dark, Assassin, Auto Assault, BioShock, Brothers in Arms Hell's Highway, Cars, Company of Heroes, Crackdown, Dawn of Mana, Dead Rising, Destroy All Humans! 2, F.E.A.R., Ghost Recon Advanced Warfighter, Happy Feet, Heavenly Sword, Hellgate: London, Just Cause, Killzone, Lost Planet: Extreme Condition, Metal of Honor Allied Assault, Motor Storm, NBA Live 07, Over the Hedge, Saint's Row, Shadowrun, Sonic The Hedgehog, Splinter Cell 4, Spore, Stranglehold, Superman Returns: The Videogame, Test Drive Unlimited, The Ant Bully, The Godfather, and Urban Chaos: Riot Response.

"We wanted the action in our game to focus on interactive elements in a highly intuitive manner," says David Nadal, Game Director at Eden Games. "We knew Havok Physics could help us do that for game-play elements, but we wanted to push the envelope even further to add persistent effects that could interact with game-play elements. Havok's GPU-accelerated physics effects middleware helped us achieve that in surprisingly little time."

Through the use of Havok FX and GPU technology, game developers are able to implement a range of physical effects like debris, smoke, and fluids that add detail and believability to Havok's physics system. Havok FX is cross-platform, takes advantage of current and next-generation GPU technology, and utilizes the native power of Shader Model 3 class graphics cards to deliver effect physics that integrate seamlessly with Havok's physics technology found in Havok Complete.

"With Havok FX we can explore new types of visual effects that add realism into Hellgate: London," commented Tyler Thompson, Technical Director, Flagship Studios. "Given the widespread installed base of GPUs and the incredible performance of the new Nvidia GeForce 7900 boards, Havok FX was a natural choice."

"

And as you may know, Havok FX was designed specifically for physics on the GPU, while Havok Hydracore is the one designed for multi-core CPUs.

Still have doubts that Havok FX is related to GPGPU? Don't worry:

[www.firingsquad.com]

"
...

FiringSquad: One of Havok's competitors, AGEIA, has said of the ATI-Havok FX hardware setup, "Graphics processors are designed for graphics. Physics is an entirely different environment. Why would you sacrifice graphics performance for questionable physics? You'll be hard pressed to find game developers who don't want to use all the graphics power they can get, thus leaving very little for anything else in that chip." What is Havok's response to this?

Jeff Yates: Well, I'm sure the AGEIA folks have heard about General Purpose GPU or "GP-GPU" initiatives that have been around for years. The evolution of the GPU and the programmable shader technology that drives it have been leading to this moment for quite some time. From our perspective, the time has arrived, and things are never going to go backwards. So, if people are going to purchase extra hardware to do physics, why not purchase an extra GPU, or better yet relegate last year's GPU to physics, and get a brand new GPU for rendering? The fact is that this is not stealing from the graphics – rather it gives the option of providing more horsepower to the graphics, or the physics, or both – depending on what a particular game needs. I fail to see how that's a bad thing. Not to mention that downward pricing for "last year's" GPUs is already feeding the market with physics-capable GPUs at the sub-$200 price point – even reaching the magic $100 price point.

FiringSquad: Finally, is there anything else you want to say at this time about Havok and its plans?

Jeff Yates: Well, we have so much good stuff coming – Havok FX is very exciting, but we are also preparing to launch our latest character behavior tool and SDK: Havok Behavior. This is really the next step forward for blending physics, animation, and behaviors, all with a configurable tool set that supports direct export and editing of content from 3ds max, Maya, and XSI. And I forgot to mention next generation optimizations and Nintendo Wii support. It is a lot, but it is all part of our big 4.0 release coming this summer. It is definitely an exciting time for us.


"

Nintendo may not make use of Shader Model 3 or higher for the Wii, since the shader models are proprietary to Microsoft's DirectX, but they can use something like the OpenGL Shading Language; for example, GLSL version 1.30 roughly corresponds to Shader Model 4.0.

[www.opengl.org]





I would place the ATI Hollywood between an ATI RV610 and an RV630. Besides being capable of HDR+AA like the ATI R520, Hollywood, according to one of Nintendo's displacement mapping patents (valid until 2020), has a GPU vertex cache and is able to do vertex texture fetching, features that on the PC side did not appear until the ATI HD 2000 series (the ATI R600 GPUs). Those features are necessary for vertex displacement mapping, which does almost all of the displacement work on the GPU, leaving the CPU to provide only the original vertex positions (which is nothing); in a few words, this technique makes use of GPGPU-style algorithms. Plus, the ATI R600 GPUs were also the first to introduce a command processor.

In the patent I read about vertex fetching and a vertex cache, the features that were not available for vertex displacement mapping until ATI's HD 2000 series (ATI R600).
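As an illustration of what vertex texture fetch buys you, here is a minimal vertex-displacement shader in GLSL (Shader Model 3.0-class hardware with vertex texture fetch), embedded as a C string; the uniform names and the displace-along-normal scheme are my own choices for the sketch.

const char *displace_vs =
    "#version 110\n"
    "uniform sampler2D heightMap;  /* displacement texture, fetched per vertex */\n"
    "uniform float scale;\n"
    "void main() {\n"
    "    /* Vertex texture fetch: read the height at this vertex's UV. */\n"
    "    float h = texture2DLod(heightMap, gl_MultiTexCoord0.st, 0.0).r;\n"
    "    /* Displace along the normal; the CPU only supplied base positions. */\n"
    "    vec4 p = gl_Vertex + vec4(gl_Normal * h * scale, 0.0);\n"
    "    gl_Position = gl_ModelViewProjectionMatrix * p;\n"
    "}\n";

The texture read inside the vertex stage is the capability in question: without vertex texture fetch, the CPU would have to compute and re-upload the displaced positions itself every frame.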


Patent Storm
[www.patentstorm.us]

"
US Patent 6980218 - Method and apparatus for efficient generation of texture coordinate displacements for implementing emboss-style bump mapping in a graphics rendering system

US Patent Issued on December 27, 2005
Estimated Patent Expiration Date: November 28, 2020
Estimated Expiration Date is calculated based on simple USPTO term provisions. It does not account for terminal disclaimers, term adjustments, failure to pay maintenance fees, or other factors which might affect the term of a patent.


Command processor 200 receives display commands from main processor 110 and parses them—obtaining any additional data necessary to process them from shared memory 112. The command processor 200 provides a stream of vertex commands to graphics pipeline 180 for 2D and/or 3D processing and rendering. Graphics pipeline 180 generates images based on these commands. The resulting image information may be transferred to main memory 112 for access by display controller/video interface unit 164—which displays the frame buffer output of pipeline 180 on display 56.

FIG. 5 is a logical flow diagram of graphics processor 154. Main processor 110 may store graphics command streams 210, display lists 212 and vertex arrays 214 in main memory 112, and pass pointers to command processor 200 via bus interface 150. The main processor 110 stores graphics commands in one or more graphics first-in-first-out (FIFO) buffers 210 it allocates in main memory 110. The command processor 200 fetches: command streams from main memory 112 via an on-chip FIFO memory buffer 216 that receives and buffers the graphics commands for synchronization/flow control and load balancing, display lists 212 from main memory 112 via an on-chip call FIFO memory buffer 218, and vertex attributes from the command stream and/or from vertex arrays 214 in main memory 112 via a vertex cache 220.

Command processor 200 performs command processing operations 200a that convert attribute types to floating point format, and pass the resulting complete vertex polygon data to graphics pipeline 180 for rendering/rasterization. A programmable memory arbitration circuitry 130 (see FIG. 4) arbitrates access to shared main memory 112 between graphics pipeline 180, command processor 200 and display controller/video interface unit 164.

FIG. 4 shows that graphics pipeline 180 may include: a transform unit 300, a setup/rasterizer 400, a texture unit 500, a texture environment unit 600, and a pixel engine 700.

Transform unit 300 performs a variety of 2D and 3D transform and other operations 300a (see FIG. 5). Transform unit 300 may include one or more matrix memories 300b for storing matrices used in transformation processing 300a. Transform unit 300 transforms incoming geometry per vertex from object space to screen space; and transforms incoming texture coordinates and computes projective texture coordinates (300c). Transform unit 300 may also perform polygon clipping/culling (300d). Lighting processing 300e also performed by transform unit 300b provides per vertex lighting computations for up to eight independent lights in one example embodiment. As discussed herein in greater detail, Transform unit 300 also performs texture coordinate generation (300c) for emboss-style bump mapping effects.

Setup/rasterizer 400 includes a setup unit which receives vertex data from transform unit 300 and sends triangle setup information to one or more rasterizer units (400b) performing edge rasterization, texture coordinate rasterization and color rasterization.

Texture unit 500 (which may include an on-chip texture memory (TMEM) 502) performs various tasks related to texturing including for example: retrieving textures 504 from main memory 112, texture processing (500a) including, for example, multi-texture handling, post-cache texture decompression, texture filtering, embossing, shadows and lighting through the use of projective textures, and BLIT with alpha transparency and depth, bump map processing for computing texture coordinate displacements for bump mapping, pseudo texture and texture tiling effects (500b), and indirect texture processing (500c).

"



Plus, the ATI RV610 has very low power consumption, around 25 to 36 watts, while the whole Wii draws around 53 watts according to its power supply rating. Also, the Wii's GPU is said to have cost around $30 to produce (a production cost, not a consumer launch price), while an ATI Radeon HD 2400 Pro (GPU + memory + etc.) was released at a price of around $59.


Wii GPU production cost: $29.60.-
[www.reghardware.co.uk]
"
Nintendo said to profit on Wii production
Consoles costs under $160 to manufacture?

By Tony Smith • Get more from this author

18th December 2006 10:50 GMT

Nintendo may well be making so much profit on its Wii console it can well afford to replace broken Remote straps. According to a Japanese publication's assessment of the machine's innards, the console costs the videogames company less than $160 to assemble.

In a report published by Japanese business weekly the Toyo Keizai and relayed by Japanese-language site WiiInside, the most expensive component inside the Wii is the DVD drive, which, the paper estimates, costs $31. Next comes ATI's 'Hollywood' graphics chip, at $29.60. The IBM-designed and made CPU, 'Broadway', costs $13, the report reckons.

Add these items to the other components, roll in an assembly cost of $19.50 and you get a manufacturing cost of $158.30. The Toyo Keizai estimates that Nintendo's wholesale price is a cent less than $196. The consoles costs the consumer - if he or she can find one available to buy right now, of course - $250.

So, if the Toyo Keizai is right, Nintendo's making the best part of $40 for every console it sells, and by most accounts it's sold well over a million of them worldwide.

Don't forget, though, it has to cover the development cost, the money spent creating the Wii's on-board software, and the physical distribution and marketing costs, but it nonetheless establishes a nice pattern for Nintendo, which does very nicely out of games sales, whether Wii-specific titles or older ones made for previous consoles that the company can now sell again as downloads. ®

"

ATI Radeon HD 2400 pro price.-
[www.phoronix.com]
"
The Radeon HD 2400PRO has a stock operating frequency of 525MHz for its core and 400MHz for its video memory while the Radeon HD 2400XT has its core running at 700MHz with the memory doubled at 800MHz. The prices for the low-end 2400PRO and 2400XT are $59 and $79 USD respectively. These ATI/AMD cards are of course competition for the NVIDIA GeForce 8500GT.

"


ATI Radeon HD 2400 pro power consumption.-

[www.dailytech.com]
"
At the bottom of the new ATI Radeon 2000-series lineup is the ATI Radeon HD 2400 series with PRO and XT models. The new ATI Radeon HD 2400 features 40 stream processors with four texture units and render backends. GPU clocks vary between 525 MHz to 700 MHz.

The ATI Radeon HD 2400 series features less than half the transistors as the HD 2600 – 180 million. AMD has the ATI Radeon HD 2400 series manufactured on the same 65nm process as the HD 2600. Power consumption of the ATI Radeon HD 2400 hovers around 25 watts.

"

Ah, and I also forgot: the ATI RV610 has a die size of only 82mm2
[www.beyond3d.com]


while we know that the ATI Hollywood has die sizes of 94.5mm2 (Vegas) and 72mm2 (Napa). I have made some calculations of the possible die size of the Wii GPU, taking into account things like the DSP, USB controllers, memory, and so on. I will save the calculations about the Wii memory and Wii GPU die size for last, but for now I will tell you that the Wii GPU may measure at least around 80mm2; that, of course, assumes that the 24MBytes of eDRAM plus the DSP are in the Napa die, and that the GPU and the 3MBytes of eDRAM are in Vegas.



And by the way, the ATI Radeon R600 series was also the first to introduce a command processor, which is like a real processor; it was introduced in order to further improve GPGPU capabilities.

[www.tomshardware.com]

"
Command Processor (CP)
9:46 AM - May 14, 2007 by D. Polkowski

According to ATI, the command processor (CP) for the R600 series is a full processor. Not that it is an x86 processor, but it is a full microcode processor. It has full access to memory, can perform math computations and can do its own thinking. Theoretically, this means that the command processor could offload work from the driver as it can download microcode to execute real types of instructions.

The CP parses the command stream sent by the driver and does some of the thinking associated with the command stream. It then synchronizes some of the elements inside of the chip and even validates some of the command stream. Part of the pipeline was designed to do some of the validation, such as understanding what rendering mode state it is currently in.

"The whole chip was designed to alleviate all of this work from the driver," Demers said. "In the past the drivers have (even on the 500 series) typically checked and went 'okay, the program is asking for all of these things to be set up and some of them are conflicting.'"

If there is a conflict for resources and states, the driver can turn off and think about what to do next. What you need to remember is that the driver is running on the CPU. The inherent problem with the driver thinking is that it is stealing cycles from the CPU, which could be doing other more valuable things. ATI claims to have moved almost all of that work - or, more importantly - all of the validation work, down into the hardware.

The command processor does a lot of this work and the chip is "self aware." The architecture allows it to snoop around to check what the other parts of the hardware are up to. An example of this is when the Z buffer checks in on how the pixel shader is doing. It is looking to see if it can kill pixels early. If a resource knows what is happening, it can switch to a mode that is the most compatible to accomplish its tasks.

Improvements utilizing this type of behavior can explain why consoles are faster in certain applications than on the PC. PCs are always checking the state of the application. There is a lot of overhead associated with this. If the application asks for a draw command to be sent with an associated state, the Microsoft runtime will check some of it and the driver will check some of it. It then has to validate everything and finally send it down to the hardware. ATI feels that this overhead can be so significant that it moved as much as it could into the hardware.

This is what is generally referred to as the small batch problem. Microsoft's David Blythe commented that miscommunication over application requirements, differing processing styles and mismatches between the API and the hardware were the largest complaints from developers. He concluded that his analysis "failed to show any significant advantage in retaining fine grain changes on the remaining state, so we collected the fine grain state into larger, related, immutable aggregates called state objects. This has the advantage of establishing an unambiguous model for which pieces of state should and should not be independent, and reducing the number of API calls required to substantially reconfigure the pipeline. This model provides a better match for the way we have observed applications using the API." Lowering the number of draw calls, setting of constants and other commands has made DX10 gain back this overhead.

Further reductions from hardware advances can also add up to an additional reduction in driver overhead in the CPU. This can be "as much as 30%" says ATI. That does not mean applications will run 30% faster, it simply means that the typical overhead of a driver in the CPU will be decreased. Depending on the application, the CPU utilization can be as low as 1% or up to 10-15%. On average this should be only 5-7%, but a 30% reduction would translate into a few percent of load change off the CPU. While this is not the Holy Grail for higher frame rates, it is a step in the right direction. There should be a benefit for current DX9 applications, and DX10 was designed to be friendly from a small batch standpoint so it should have more benefits than DX9 in terms of CPU load reduction.

Next page.- [www.tomshardware.com]

[img.tomshardware.com]

The Setup Engine is located between the command processor and the ultra-threaded dispatch processor. Notice the programmable tessellator as part of the setup engine? Characteristically, ATI has had a typical primitive assembly setup / scan converter because it had a separate vertex shader. You could consider the area before the vertex shader as the setup engine. ATI now only has one shader waiting for all of the different data streams coming in. For this reason, everything between the command processor and the shader core is the shader setup engine.

This engine will do at least three kinds of processing in order to get everything ready for the shaders. Once finished, it will send it into the shader.

Vertex:
It will do all of the vertex assembly, tessellation, addresses of the vertices to be fetched, gets the indices and does some math associated with it. The shader fetches vertices with the data that was computed by the setup unit and then sends it down for all of the vertex processing.

Geometry:
It is the same way with geometry. It will fetch all of the near-neighbor information and primitive information. There is a lot of addressing computation done before sending it down to the shader for data computation.

Pixel:
This still does the classic pixel setup. That means getting ready for, and the actual act of, scan conversion (rasterization), generating the pixel data which will be sent to the shading through the interpolators. After that the shaders (programs) will execute on the pixel shader.

next page.- Ultra-Threaded Dispatch Processor

next page.- Sequencing, and so on, and: A CPU can only process a few threads at a time, while graphics processors can simultaneously accommodate thousands of threads. ATI has its workload set to 16 pixels per thread. When you multiply that by the number of threads, it means that at any given moment there can be tens of thousands of pixels in the shaders. With an abundance of work in a parallel manner, there should be no worries about masking latency. That is the art of parallel processing and why GPUs more easily hit their theoretical numbers, compared to CPUs.

next page, and it continues

"



As you can see, the command processor in the Wii Hollywood would be the Starlet or the ARM9 as you prefer to call it.

Best Wii graphical games:
Zangeki no Reginleiv
Cursed Mountain
Dead Space: Extraction
Resident Evil: The Darkside Chronicles
Silent Hill: Shattered Memories
Endless Ocean 2
GTI Club Supermini Festa
etc.



If you wonder why, if the Wii has such a capable GPU, it hasn't been used properly (at least for doing vertex displacement mapping), the answer can be given by Tim Sweeney of Epic Games.

Tim Sweeney: GPGPU Too Costly to Develop
3:11 PM - August 14, 2009 by Kevin Parrish - source: Tom's Hardware US
[www.tomshardware.com]
"
Epic Games' chief executive officer Tim Sweeney recently spoke during the keynote presentation of the High Performance Graphics 2009 conference, saying that it is "dramatically" more expensive for developers to create software that relies on GPGPU (general purpose computing on graphics processing units) than those programs created for CPUs.

He thus provides an example, saying that it costs "X" amount of money to develop an efficient single-threaded algorithm for CPUs. To develop a multithreaded version, it will cost double the amount; three times the amount to develop for the Cell/PlayStation 3, and a whopping ten times the amount for a current GPGPU version. He said that developing anything over 2X is simply "uneconomic" for most software companies. To harness today's technology, companies must lengthen development time and dump more money into the project, two factors that no company can currently afford.

But according to X-bit Labs, Sweeney spent most of his speech preaching about the death of GPUs (graphics processing units) in general, or at least in a sense as we know them today. This isn't the first time he predicted the technology's demise: he offered his predictions of doom last year in this interview. Basically, the days of DirectX and OpenGL are coming to a close.

"In the next generation we'll write 100-percent of our rendering code in a real programming language--not DirectX, not OpenGL, but a language like C++ or CUDA," he said last year. "A real programming language unconstrained by weird API restrictions. Whether that runs on Nvidia hardware, Intel hardware or ATI hardware is really an independent question. You could potentially run it on any hardware that's capable of running general-purpose code efficiently."

"


The best engine I know of for Wii is Gamebryo LightSpeed, since it includes Floodgate, which is a stream processing tool like CUDA or OpenCL that could be used to unlock the Wii GPU's true power, maybe even for vertex displacement mapping. Floodgate can run either on CPU cores or on GPU stream processors, vertex processors, etc.; in the GPU case, we must remember that GPGPU is applicable starting from the ATI R520 technology and the Nvidia 8800.

[www.emergent.net]

"
A Real Wii Game Engine
Emergent was one of the first middleware vendors to support the Wii. Gamebryo makes it easy to build for multiple and diverse hardware targets. Even if you're simultaneously developing for single-core and multi-core environments, Gamebryo takes care of the heavy lifting, and you won't sacrifice power or control.

Gamebryo 2.6 establishes full tools capabilities for the Wii, including availability of the Terrain Builder and Run-Time for this platform for the first time. Additional updates for Wii designers and programmers include:

Extended support for additional Wii texture formats
More efficient exports from artists' tools through smaller data types that require less memory
Reduced MEM1 requirements for improved resource utilization
Highly optimized Floodgate™ stream processing tailored for the Wii

Gamebryo Leverages the Wii's Unique Architecture
Gamebryo for the Wii includes optimizations to the exporter pipeline, input mechanisms and rendering pipeline to ensure that you get the most out of this popular console. Gamebryo also supports the Wii Remote™ controller, and other standard Gamebryo features are mapped to the unique capabilities of this platform.

Cross-Platform Game Development With the Wii
Emergent's Floodgate™ stream processing engine has been made Wii-aware for cross-platform game development projects: execution is now automatically optimized using the locked cache functionality, which reduces risk when porting multiprocessor code. Projects created with Gamebryo for other platforms can now be easily re-targeted to the Wii.


"




The Wii will be on the market for about 5 or even more than 8 years according to Satoru Iwata (he said that in 2009), since they won't launch new hardware while the current one can still be squeezed; only when the current hardware starts to slow down will they launch a new system.

Another proof that the Wii Hollywood was designed with GPGPU in mind is that it has been making use of Havok FX for physics on the GPU.


[www.gamasutra.com]


If you still think that the Hollywood has a fixed pipeline and doesn't have shader support, you should see this thread:


[boards.ign.com]



The Wii has an API called GX, which resembles OpenGL, not DirectX; you have to make use of something similar to the OpenGL Shading Language, not a DirectX shader model.



[www.levelup.com]
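On real Wii homebrew, the closest thing to "pixel shading" is configuring TEV (texture environment) stages through GX. A minimal C sketch using libogc's GX calls, assuming the standard homebrew toolchain; it sets up one stage that modulates a texture by the rasterized vertex color:

#include <gccore.h>  /* libogc GX API */

void setup_tev(void)
{
    /* One TEV stage: output = texture color * vertex color. */
    GX_SetNumTevStages(1);
    GX_SetTevOrder(GX_TEVSTAGE0, GX_TEXCOORD0, GX_TEXMAP0, GX_COLOR0A0);
    GX_SetTevOp(GX_TEVSTAGE0, GX_MODULATE);
}

Real effects chain many such stages with custom combiner inputs, which is why TEV setups are often compared to short, fixed-length pixel shaders.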



---OK, here goes the part that calculates the Wii Hollywood die size, as well as the memory size.

First of all, read this article.- [gear.ign.com]

This article was written in June 2006, a while after IGN reported the specs of the Wii based entirely on the SDKs that some third-party developers had, and of course those weren't the final SDKs.

"
According to the article, Nintendo had selected NEC Electronics to provide 90-nanometer CMOS-compatible embedded DRAM technology for the forthcoming Wii console. The new LSI chips with eDRAM will be manufactured on NEC's 300-millimeter production line at Yamagata. While no clock speeds or really juicy info was included in the release, the news does bring up a variety of interesting points.

There are also other interesting points here: like the mention of the MIM2 structure on eDRAM implementations (since the third quarter of 2005) and the revelation that NEC selected MoSys as the DRAM macro design partner for the "Wii devices," due to the fact that MoSys is quite familiar with implementing 1T-SRAM macros on NEC's eDRAM process.
"
There is also an image indicating the available macros from NEC according to year, so we can conclude from it that the Wii makes use of the UX6D macro made by NEC with MoSys technology.

Now, NEC said a 1T-SRAM macro from MoSys, right, but that does not state the use of the first-generation 1T-SRAM technology. If we want to confirm which one is being used, we just have to either call MoSys or go through its homepage and find it ourselves. There was a page named "MoSys available macros" which listed the macros and the process technology, in which you could see that only 1T-MIM macros were available starting from the 90nm node. The page is gone now, since MoSys made some changes to its homepage and I can no longer find that table they previously had; nevertheless, you should know that each new macro technology that MoSys develops is cheaper, meaning that using the original 1T-SRAM macro was not an option for the Wii, since it was expensive and not even available at 90nm.

[www.eetimes.com]
"
MoSys Inc. and NEC Electronics officials confirmed Monday they will supply Nintendo Co. Ltd. with 1T-SRAM and eDRAM technology, respectively, for the Wii game console.

MoSys and Nintendo have worked together for the last six years. Previous generations of MoSys' 1T-SRAM technology were built into the GameCube. The newest 1T-SRAM technology embedded in the Wii console uses NEC Electronics' 90-nanometer CMOS-compatible embedded DRAM (eDRAM) process technology.

NEC said it selected MoSys as the DRAM macro design partner for the Wii because the company has experience in implementing 1T-SRAM macros on NEC's eDRAM process.

"

It says the newest 1T-SRAM technology, and as we know, the newest technology from MoSys back then was 1T-SRAM-MIM, which is like 1T-SRAM-Q but using a MIM capacitor; nevertheless, while that was for the TSMC partnership, NEC had its own MIM2 technology, which Nintendo considered the better deal.



OK, let's check these:

Nec Electronics MIM2:

1.- [www.eu.necel.com]

2.- [www.necel.com]

3.- [www.necelam.com]

4.- [www.eu.necel.com]
[www.necel.com]


1.- In the first article we can view the Key Technologies for eDRAM Cells table, and we can see that only UX6D (CB-90) is mentioned at 90nm, with 1Mb in 0.22mm2 (I did further research and discovered that this size includes the overhead), which is the same as saying 1 bit per 0.22um2.

Here you can check the previous 1T-SRAM technologies: [en.wikipedia.org]. You may be wondering why there are two sizes, like 1T-SRAM-Q bit cell 0.28mm2 and 1T-SRAM-Q 0.55mm2 with overhead, and whether both can hold 1Mbit. I found the answer in this other article, from a conference where MoSys talked about the evolution of their macro designs:


[www.bjic.org.cn]




As you can see, the bit cell must refer to just the capacitor size (since it only mentions its dimensions) and the macro size must be the whole circuit (since it mentions its dimensions and the Mbits in the format Xmm2/Mb); but then, this should have been clear in the IGN report, in the table with the macro release list running from 2002 to 2007.

**** Wii macro memory revealed *************
UX6D is a macro based on 1T-SRAM-Q at 90nm, replacing the FAC with NEC's MIM2 technology. Since the only kind of eDRAM available at 90nm is UX6D, which uses MIM2 (the dielectric material is ZrO2) and not MIM (dielectric material Ta2O5) from the previous generation at 130nm, what IGN reported is true, even if they don't explicitly mention the use of UX6D as the macro design in the Wii (they preferred to expose it in NEC's macro release table available in their report).

Don't forget to read the Material Advantages section in the link, since I have not yet figured out what they mean by "ZrO2's high dielectric constant" (does this mean that it needs almost no refreshing compared to normal SRAM?).


So, given that both NEC and TSMC use their own MIM-structure technology, and as you saw in the previous articles that 1T-MIM is a derivative of the 1T-Q macro (0.55mm2/Mb), measuring between 0.21mm2/Mb and 0.23mm2/Mb, UX6D at 0.22mm2 is the full macro that can contain 1Mb; of course, this may change a little depending on which of the three types of CB-90 (in the UX6D context) is being used and on the clock speed.

Still doubting? Not to worry: just read this, and even a sceptical person will no longer doubt that everything I have said so far is fact: [www.necel.com]

Once you read it, you find that the article is basically suggesting what "Vegas" could contain, but it uses the GameCube's Flipper as the example. It also notes that Flipper had 24Mbits of eDRAM, but that today's UX6D could easily fit 256Mbits (32MB) in half of Flipper's die. Flipper's die was about 110mm2, so in 55mm2 you can store 32MB.

Remember that UX6D is 0.22mm2 per 1Mb. Then let's do the following:

1/0.22 = 4.5454 Mb in 1mm2; 4.5454/8 = 0.5682 MB/mm2

32MB/55mm2 = 0.5818 MB/mm2

So 0.5818 MB/mm2 ≈ 0.5682 MB/mm2, approximately. And we are assuming a 110mm2 die for Flipper; if in reality it was 112mm2, the match would be even closer.

Let's not forget that this memory die size assumes a speed of about 200 to 300MHz; I am not sure how much the clock speed can affect the die size, and there is no information about it. Check CB-90M for more information.



OK then, now how much die area would 24MBytes of eDRAM occupy?

1mm2 = 0.5682 MBytes
24MBytes = ~42.2mm2

and for 3MBytes:

1mm2 = 0.5682 MBytes
3MBytes = ~5.3mm2
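A quick C sanity check of that arithmetic (just the numbers above, nothing Wii-specific is assumed beyond the 0.22mm2/Mb figure):

#include <stdio.h>

int main(void)
{
    const double mm2_per_Mb = 0.22;                       /* UX6D macro: 1 Mbit per 0.22 mm2 */
    const double MB_per_mm2 = 1.0 / (mm2_per_Mb * 8.0);   /* 8 Mbit per MByte */

    printf("density: %.4f MB/mm2\n", MB_per_mm2);               /* ~0.5682 */
    printf("Flipper check: %.4f MB/mm2\n", 32.0 / 55.0);        /* ~0.5818 */
    printf("24 MB of eDRAM: %.1f mm2\n", 24.0 / MB_per_mm2);    /* ~42.2 */
    printf(" 3 MB of eDRAM: %.1f mm2\n",  3.0 / MB_per_mm2);    /* ~5.3 */
    return 0;
}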


OK, remember that those are approximations. Plus, I am not sure there are just 3MBytes of eDRAM in Vegas, since a Ubisoft developer once said that he would place Hollywood between an ATI X1400 and X1600 and that the Wii had 6MBytes of eDRAM, 2 for the framebuffer and Z-buffer and 4 for texture cache; but that's another story.-

[newsgroups.derkeiler.com]


OK, that's about the memory, but what about things like the USB controller, the ARM9, etc.?

The ARM926EJ-S in Hollywood is about 1.45mm2 according to.-

[www.tensilica.com]

As for the USB controllers, SD controllers, etc., I can't imagine them occupying much room, since even an ARM7 only occupies 0.47mm2. I did a little further investigation and ended up concluding that the Wii Hollywood might have an ATI Imageon embedded, which is the ARM9 plus peripherals, and that it could occupy from 3 to 5mm2 depending on design complexity.


So, let's say that the Starlet plus its components occupy at least 5mm2, and let's add the 5.3mm2 for the 3MBytes of eDRAM.

Now, Vegas has a 94.5mm2 die:
94.5 - 10.3 = 84.2mm2

Napa is supposed to contain the 24MBytes of eDRAM plus the DSP, from what I read on WiiBrew.


Well, hope this information will be useful





--- Wii 720p

It is well known that the Wii can only output 480p right now, but something interesting happened to me while I was testing MPlayer CE or GeeXboX (I don't remember which one it was). I tried a 720p movie, DBZ: Broly - Second Coming, and I clearly saw that the movie was being recognized, but of course the framerate was extremely low, like 5 to 10 fps. I also noticed that the picture was a lot clearer and was probably running at 720p, though maybe it was being downscaled; I'm not sure, but if it was downscaling, I must say the image quality was really good.

OK then, the thing is that if the problem with current applications like MPlayer CE is that 720p requires too much power from the CPU, there are ways to leave some of the work to the GPU so that most of the CPU work can be offloaded to it; in a few words, make use of GPGPU software.

I have read about things like GPGPU for H.264 decoding and other things.-
[developer.amd.com] (PDF)


If there is a way to use OpenCL, CUDA, ATI Stream or Floodgate for Wii homebrew, it would be interesting to see how much better the results could be from now on; Nvidia, right now, is working with Adobe on a Flash player that makes use of GPGPU.


By the way, as for the Dolphin Wii emulator, you should know that games like Dead Space: Extraction require a lot of hardware (like a dual-core 3GHz AMD Athlon, 4 gigabytes of RAM, a GPU from the HD 2000 series upward and, to push it even further, the use of OpenCL for the framerate slowdowns and other things), and even with all that the game runs at just 15 to 20 fps; most probably this is because the game makes heavy use of physics.

To be emulated, a GameCube game only requires:
a CPU of more than 1GHz (recommended)
an ATI Radeon 9500 or better
256MBytes of RAM

Do you see the clear difference?

I only investigated the Dolphin Wii/GameCube emulator to show, by example, that the Wii is a totally different architecture from the GameCube; I am not promoting its use.



It would be interesting to see the code of games like:

Zangeki no Reginleiv
Monster Hunter 3
Resident Evil: The Darkside Chronicles
Cursed Mountain
Dead Space: Extraction

I just hope that in future projects we will be making use of stream processing frameworks like OpenCL, CUDA or ATI Stream. The only detail is that it is unknown whether the Wii GPU would be compatible with, for example, OpenCL, since OpenCL doesn't support all video cards; even though it was intended to run on any GPU designed with GPGPU in mind, some ATI and Nvidia cards designed for that are still not supported.
Re: Wii GPU is similar to ATI Rv610
March 12, 2010 09:41PM
I hear ya man... I think the only homebrew out there that has taken full advantage of the wii graphics package is this one here.
Re: Wii GPU is similar to ATI Rv610
March 12, 2010 10:03PM
If we could somehow take advantage of the Wii's GPU for GPGPU, we could improve the emulators and MPlayer; who knows, maybe even make 720p real on the Wii.

With GPGPU you can obtain between 100x and 250x more performance than with the CPU, since GPGPU takes advantage of the GPU's stream processors or vertex processors to run threads in parallel; starting from the ATI R520 there are supposed to be at least 40 vertex units or stream processors, and they can run about 512 threads.


For now, I just want to get in contact with bushing; can anyone here help me with that?

Does somebody here have his e-mail?







Re: Wii GPU is similar to ATI Rv610
March 12, 2010 11:05PM
@tapionvslink: You could sign into the IRC channel #wiidev on EFnet. He usually hangs around there.
Re: Wii GPU is similar to ATI Rv610
March 12, 2010 11:58PM
I see.

Do you think I'll catch him logged in this Saturday?

And at what hour will he be there?
Re: Wii GPU is similar to ATI Rv610
March 13, 2010 12:21AM
let me check with his secretary... come on man
Re: Wii GPU is similar to ATI Rv610
March 13, 2010 12:50AM
Your post was much too long to read.

The Wii's GPU is very likely cut down, and there is no conclusive evidence that it supports GPGPU. The article mentioning Havok FX does not say that the Wii will support it: it only says that the core Havok system (not the GPU-optimized FX system) will support the Wii. Havok FX never came out anyway. Somebody would have done something with GPGPU already if we could have.

As for 720p, the Wii's video output circuitry does not support it. It's not a power limitation; the simple reason that 720p is not supported is that the Wii cannot output any video signal greater than 576p.

Sorry to crush your dreams.

About bushing: just wait for him to get on. Nobody knows when he will. He will also probably give you a better explanation than I can as to why the Wii can't do this.
Re: Wii GPU is similar to ATI Rv610
March 13, 2010 01:51AM
There are many proofs that the ATI Hollywood is capable of GPGPU:

1.- It can do HDR + anti-aliasing in realtime graphics like Monster Hunter 3, and for that you need at least an ATI R520, due to the architecture and technology required; ATI achieved it not through raw power but through design.

2.- Havok FX is used for physics on the GPU instead of the CPU, and there is the FiringSquad article above where, when asked about the AGEIA competition, the Havok FX people argued that doing physics with GPGPU on a GPU would be better than using AGEIA's dedicated processor.

3.- The Wii supports Floodgate, which is a stream processing engine like CUDA and ATI Stream. How is a stream processing engine going to run on a single-core CPU? Floodgate can run either on CPU cores or on GPU ones (stream processors), and there is no benefit to using Floodgate without a multicore environment. There is also this.- [beautifulpixels.blogspot.com]


"
Getting into consoles, we start with the Wii. There are two types of memory, both accessible by CPU and GPU. However, what's really interesting is the ability to lock a portion of the L1 cache and explicitly manage it with DMA transfers. In one test case, we saw 2.5 times performance improvement by explicitly managing Floodgate transfers with the locked cache!
"

2.5x performance: can that be achieved on a single-core CPU? Not to mention that this was just a simple experiment, not tuned to achieve even more.

You see that? Those are DMA transfers; they are used so that data and instructions prepared by the CPU are passed to memory the GPU works from. This is common in GPGPU work, because even though the GPU can do some general-purpose things, it cannot do anything without commands or instructions telling it what to do.
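As a concrete sketch of the pattern the Beautiful Pixels post describes (staging data through the locked cache with explicit DMA), here is roughly what it looks like in C on the Broadway CPU. I am assuming libogc's locked-cache helpers (LCEnable, LCLoadData, LCStoreData, LCQueueWait) and the usual 0xE0000000 locked-cache mapping; take the exact names and signatures as assumptions and check the libogc headers before relying on this.

#include <gccore.h>   /* libogc; locked-cache helpers assumed available */

#define LC_BASE ((float *)0xE0000000)  /* locked L1 cache region (assumed mapping) */

/* Double the values in a big RAM buffer, chunk by chunk, doing the
 * compute entirely out of the locked cache, as in the quoted experiment. */
void process_buffer(float *data, int n, int chunk)
{
    LCEnable();  /* lock part of L1 as a DMA scratchpad */
    for (int i = 0; i < n; i += chunk) {
        LCLoadData(LC_BASE, &data[i], chunk * sizeof(float));  /* DMA: RAM -> locked cache */
        LCQueueWait(0);                                        /* wait for the DMA to finish */
        for (int j = 0; j < chunk; ++j)
            LC_BASE[j] *= 2.0f;                                /* compute in the scratchpad */
        LCStoreData(&data[i], LC_BASE, chunk * sizeof(float)); /* DMA: locked cache -> RAM */
        LCQueueWait(0);
    }
    LCDisable();
}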

DMA transfer
[en.wikipedia.org]

"
Direct memory access (DMA) is a feature of modern computers and microprocessors that allows certain hardware subsystems within the computer to access system memory for reading and/or writing independently of the central processing unit. Many hardware systems use DMA including disk drive controllers, graphics cards, network cards and sound cards. DMA is also used for intra-chip data transfer in multi-core processors, especially in multiprocessor system-on-chips, where its processing element is equipped with a local memory (often called scratchpad memory) and DMA is used for transferring data between the local memory and the main memory. Computers that have DMA channels can transfer data to and from devices with much less CPU overhead than computers without a DMA channel. Similarly a processing element inside a multi-core processor can transfer data to and from its local memory without occupying its processor time and allowing computation and data transfer concurrency.

Without DMA, using programmed input/output (PIO) mode for communication with peripheral devices, or load/store instructions in the case of multicore chips, the CPU is typically fully occupied for the entire duration of the read or write operation, and is thus unavailable to perform other work. With DMA, the CPU would initiate the transfer, do other operations while the transfer is in progress, and receive an interrupt from the DMA controller once the operation has been done. This is especially useful in real-time computing applications where not stalling behind concurrent operations is critical. Another and related application area is various forms of stream processing where it is essential to have data processing and transfer in parallel, in order to achieve sufficient throughput.


"
Want to know more about DMA transfers?
[electronicdesign.com]
"
Now that a kernel is written in C for CUDA, it must be called from the host CPU. First, allocate memory on the GPU device and copy the array data from the host RAM to the GPU device memory. This is done using the cudaMalloc(…) and cudaMemcpy(…) commands (see “CUDA Setup Code For Jacobi Relaxation”). Note that cudaMalloc(…) returns a pointer to an address in device memory. It also can be used in kernel calls that will access that device memory. However, it can’t be accessed from the host code, because the address it points to is not in the host system RAM, but in device RAM on the GPU.

Via the PCI-e bus, cudaMemcpy(…) initiates a DMA transfer from host RAM to device RAM, with the last parameter controlling direction of the transfer between host and device memory.


"
The host RAM is the main RAM (MEM1 in Wii's case) and the device RAM is the local memory of the GPU (MEM2 in Wii's case).
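Here is that host-side sequence as a minimal C program against the standard CUDA runtime API. This is ordinary PC CUDA code, shown only to illustrate the host-to-device DMA pattern the article describes; nothing here runs on Wii.

/* Allocate device RAM, DMA an array to it over PCIe, and copy it back. */
#include <cuda_runtime.h>
#include <stdio.h>

int main(void) {
    enum { N = 1024 };
    float host[N], *dev = NULL;
    for (int i = 0; i < N; i++) host[i] = (float)i;

    cudaMalloc((void **)&dev, N * sizeof(float));  /* pointer into device RAM, not host RAM */
    cudaMemcpy(dev, host, N * sizeof(float),
               cudaMemcpyHostToDevice);            /* DMA: host RAM -> device RAM */

    /* ...kernel launches would operate on `dev` here... */

    cudaMemcpy(host, dev, N * sizeof(float),
               cudaMemcpyDeviceToHost);            /* DMA back: device RAM -> host RAM */
    cudaFree(dev);
    printf("round trip ok: host[3] = %.1f\n", host[3]);
    return 0;
}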



The article also mentions that Gamebryo leverages Wii's unique architecture, so what's so unique about it?


4.- Let's not forget that the ATI Hollywood was finished sometime after June 2006, when NEC said it would provide the eDRAM for Wii. So why did the Hollywood GPU take so long to finish? Just look at the displacement mapping patent: it clearly mentions that the GPU will support a command processor, something not available until the ATI R600 series (in Hollywood's case it would be the ARM9), and that it can do vertex texture fetch, something that also wasn't available until the ATI R600. Besides, even the Shark 3D engine for Wii clearly says that Hollywood's rendering pipeline supports vertex data.

Wii has done normal mapping in games like Dewey's Adventure, and in Zangeki I am not so sure whether I am looking at normal mapping or maybe displacement mapping, because some places like the villages and castles look really good. Damn, even the characters look good, and there are like 50 enemies on screen plus the good characters and villagers, all in real time (check the GameTrailers videos of Zangeki).

Just look at games like Zangeki no Reginleiv on GameTrailers, or Xenoblade, Monster Hunter 3, Cursed Mountain, Endless Ocean 2, Resident Evil: The Darkside Chronicles. Look not just at the graphics but at the physics.


5.- I forgot: Wii also supports Havok Destruction and Havok Cloth, and those tools work with OpenCL.


6.- Remember that the director of RE5 first said Wii couldn't handle RE5's screen presentation, and that later, after RE: The Darkside Chronicles, just by looking at the graphics and illumination details, he considered RE5 on Wii possible. Combine that with the fact that Capcom later revealed it would adapt its MT Framework (an engine built for multicore environments, normally CPU cores) to Wii; for that it would have to be GPGPU, since GPUs like the ATI Rv610 have at least 40 stream processors. Or what Factor 5 said about their engine for Wii? Two coincidences, that's too much.


and remember

HDR+AA cannot be done on an architecture below an ATI R520. I'll even go so far as to say that if you prove to me that HDR+antialiasing has been done on the Xbox 1 or GameCube, then I take back everything I have said. True HDR+antialiasing, not simulated like in Twilight Princess.


I know that bushin knows things that I know, but I also know things he doesn't. Don't forget that everybody is ignorant, but not everybody is ignorant of the same things.

I have been investigating GPGPU, the Wii Hollywood, games, and articles for about a year, and I am sure of this: the problem is not the hardware, it's the software. I have gone so far as to study OpenCL by myself, but the documentation is really complex and there are not many examples to practice with. Plus, we would have to find a way to make it compatible with Wii, maybe a driver, like how CUDA was made to run on an ATI Radeon.


Check these games:
Zangeki no Reginleiv
Cursed Mountain
Dead Space Extraction
Resident Evil: The Darkside Chronicles
Silent Hill
Endless Ocean 2
Xenoblade
GTI Club mini fiesta (images of the Wii version can be found on DSWii; I tell you this because otherwise you will most likely find the PSP version's)



Edited 2 time(s). Last edit at 03/13/2010 02:02AM by tapionvslink.
Re: Wii GPU is similar to ATI Rv610
March 13, 2010 05:32AM
EDIT: Okay, so I looked and the GPU does have a command processor. Now we need to see whether we can use it for anything without having to give up graphics, whether it's a legitimate command processor like you think it is, and whether it doesn't suck.

From what I know, nobody uses it, which either means that you're wrong about it doing anything or that people have never felt like using it.



Edited 4 time(s). Last edit at 03/13/2010 05:41AM by WikiFSX.
Re: Wii GPU is similar to ATI Rv610
March 13, 2010 04:24PM
I am not sure what GPGPU really means to you or what you believe it would bring to Wii programming, but I'm pretty sure the Wii/GC GPU is already a very well known piece of hardware, since the official leaked documentation is very detailed about what you can and cannot do, and nearly everything has been handled by libogc for years (including hardware antialiasing and the Command Processor; these are no mysteries). There are also Nintendo patents that are more trustworthy than speculation over release dates or chip similarities.

Basically, everything a programmer needs to know is probably already known; it's just that most homebrew doesn't use every feature because it doesn't need to, that's all. Some projects do use advanced GX programming techniques (Wii64, for example), but the average user won't know about that. It's not as if saying that Wii can do GPGPU is going to revolutionize homebrew, because it's probably already being used without you even noticing...
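For instance, hardware antialiasing is just part of the standard libogc GX init template, something like this fragment (reproduced from memory, so treat the exact details as approximate):

/* Fragment of the usual libogc GX setup showing hardware AA; from memory. */
#include <gccore.h>

void setup_aa(GXRModeObj *rmode) {
    /* the AA/deflicker filter is applied when the EFB is copied out to the XFB */
    GX_SetCopyFilter(rmode->aa, rmode->sample_pattern, GX_TRUE, rmode->vfilter);
    /* AA mode needs the 16-bit pixel format so the EFB can hold the extra samples */
    GX_SetPixelFmt(rmode->aa ? GX_PF_RGB565_Z16 : GX_PF_RGB8_Z24, GX_ZC_LINEAR);
}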

On a side note, quoting highly technical articles without really explaining (or understanding) them never made any "investigation" look more serious ;-)
Re: Wii GPU is similar to ATI Rv610
March 13, 2010 09:18PM
"GPGPU" means general purpose computing on GPUs. It's possible on most everything that does graphic processing (even framebuffers), but only some GPUs are actually designed for it. Wii64 does not use this.
Re: Wii GPU is similar to ATI Rv610
March 14, 2010 01:47AM
Quote
WikiFSX
"GPGPU" means general purpose computing on GPUs. It's possible on most everything that does graphic processing (even framebuffers), but only some GPUs are actually designed for it. Wii64 does not use this.


Now you get it. GPGPU is for doing stuff that would normally be done on a multicore processor. In 2005 ATI launched their first GPUs compatible with this technology, the ATI R520, and those were also the first ATI Radeons that supported HDR+AA.

You may know that a GPU normally has vertex shaders; these processors can be used for GPGPU algorithms, and a GPU like an ATI R520 would normally have 40 of them. Just imagine how many kernels can be run in parallel; from what I have read, it's possible to run about 512 threads on an ATI R520.

Remember how, when Wii launched, many programmers complained that Wii had no shaders? It's not that it doesn't have shaders, since programmable shading can be done on it, and the proof is that it has done HDR in many games lately (HDR cannot be done with fixed function like Flipper). The thing is that the ATI Hollywood has stream processors, and through them you can run programmable shaders. But due to the complexity of the design, engines like Shark 3D have appeared: the engine does the abstraction for you, and the programmer just writes the shader programs as he normally would. Wii makes use of an API called GX, very similar to OpenGL. I suggest you read about Shark 3D for Wii, especially the part about the "Re-entrant Modular Shader System", to understand more about shaders on Wii.
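To give an idea of what shading configuration looks like under GX: instead of uploading a shader program, you set up TEV combiner stages through calls like these. A minimal sketch based on libogc's GX API, with the exact names quoted from memory:

/* One TEV stage that modulates a texture with the rasterized vertex color.
 * Sketch based on libogc's GX API; names from memory. */
#include <gccore.h>

void setup_modulate_stage(void) {
    GX_SetNumTevStages(1);                      /* use a single combiner stage */
    GX_SetTevOrder(GX_TEVSTAGE0, GX_TEXCOORD0,
                   GX_TEXMAP0, GX_COLOR0A0);    /* inputs: texcoord, texture, vertex color */
    GX_SetTevOp(GX_TEVSTAGE0, GX_MODULATE);     /* output = texture color * vertex color */
}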

Stream processor
[www.brighthub.com]
"
Stream Processer

Stream processors are a relatively new technology to be introduced to graphics cards. Essentially, stream processors
can be allocated different processes to perform depending on what graphical environment is to be generated. For example, in indoor scenes the stream processors can be set as shaders, while in outdoor scenes the stream processors can be used to map vertices.
Stream processors are commonly used in the newer generations of graphics cards, replacing dedicated vertex shaders and pixel pipelines.

The above are the main elements which make up the graphics processing power of your graphics card. There are, however, other areas of the card which can be set and controlled using software such as antialiasing and anisotropic filtering. Depending on the game you are playing, and the more powerful the graphics card you have, the higher these additional factors can be pushed.

Read more: [www.brighthub.com]

"


This is an example of the advantage a GPU has over a CPU when the GPU is used for general-purpose work.



I have read that an ATI HD 2600 XT (with 120 stream processors) can theoretically do 192 GFLOPS
[www.amd.com]

Now then, the ATI Hollywood would normally have 40 stream processors like the HD 2400 Pro (Rv610). I am not sure if that means it would have three times less power than the 2600 XT, since the power also depends on how well your programs take advantage of the parallelism, and memory latency plays a big part too.
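For what it's worth, the 192 GFLOPS figure checks out if you assume one MADD (2 flops) per stream processor per clock; the core clocks below (~800 MHz for the 2600 XT, ~525 MHz for the 2400 Pro) are my assumption, not from the article:

\text{peak} = N_{\mathrm{SP}} \times 2\ \text{flops (MADD)} \times f_{\mathrm{clock}}
\text{RV630 (HD 2600 XT):}\quad 120 \times 2 \times 0.8\,\mathrm{GHz} = 192\ \mathrm{GFLOPS}
\text{RV610 (HD 2400 Pro):}\quad 40 \times 2 \times 0.525\,\mathrm{GHz} = 42\ \mathrm{GFLOPS}

So once the lower clock is factored in, the gap is closer to 4.5x than 3x; and at Hollywood's 243 MHz, 40 hypothetical stream processors would give only about 19 GFLOPS, if the premise held at all.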

Now then, using a GPU for general-purpose work is not as easy as it sounds. You not only have to adopt a new way of programming like OpenCL; another problem is that it's more complex and takes more time to write the programs. Not to mention that we would have to find a way to implement OpenCL on Wii: OpenCL 1.0 is compatible with some video cards designed for GPGPU, so something like a driver would have to be written to make it work. I would hope that ATI Stream would work on Wii, but knowing Nintendo, I bet they have made some arrangements to keep it from being directly compatible.
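To give a sense of what the OpenCL host side involves, here is roughly the minimum boilerplate to run one kernel over one buffer (standard OpenCL 1.0 C API, error checking and releases omitted). Purely illustrative: no OpenCL implementation exists for Wii.

/* Minimal OpenCL 1.0 host program: build a kernel, run it over a buffer. */
#include <CL/cl.h>
#include <stdio.h>

static const char *src =
    "__kernel void scale(__global float *a) {"
    "    size_t i = get_global_id(0);"
    "    a[i] *= 2.0f;"
    "}";

int main(void) {
    cl_platform_id plat;
    cl_device_id dev;
    clGetPlatformIDs(1, &plat, NULL);
    clGetDeviceIDs(plat, CL_DEVICE_TYPE_GPU, 1, &dev, NULL);

    cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
    cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, NULL);

    float data[256];
    for (int i = 0; i < 256; i++) data[i] = (float)i;
    cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR,
                                sizeof(data), data, NULL);     /* copied to device memory */

    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
    clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);           /* compile for this GPU */
    cl_kernel k = clCreateKernel(prog, "scale", NULL);
    clSetKernelArg(k, 0, sizeof(buf), &buf);

    size_t global = 256;                                       /* one work-item per element */
    clEnqueueNDRangeKernel(q, k, 1, NULL, &global, NULL, 0, NULL, NULL);
    clEnqueueReadBuffer(q, buf, CL_TRUE, 0, sizeof(data), data, 0, NULL, NULL);

    printf("data[3] = %.1f\n", data[3]);                       /* expect 6.0 */
    return 0;
}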


The reason why Nintendo decided to delay Wii's GPU and design it for GPGPU is that that way they would get a very capable GPU without needing a lot of power, and of course the GPU would be way cheaper, just like the ATI Rv610 found in the ATI Radeon HD 2400 Pro, which was priced at just $59 when launched, while the Hollywood had a production cost of $30 (I don't know what its retail price would be, but I imagine it could be around $50).


Well, now I will check EFnet to see if I have any luck.



Edited 1 time(s). Last edit at 03/14/2010 01:57AM by tapionvslink.
Re: Wii GPU is similar to ATI Rv610
March 14, 2010 05:33AM
you're talking nonsense. go annoy other people.
Re: Wii GPU is similar to ATI Rv610
March 14, 2010 05:38AM
In essence you're a really dedicated troll. You're throwing around insane amounts of information you can't even comprehend, trying to make something impossible look at least improbable.

Please go away kthnx.