First screenshot of our FPS game

04.01.2010 02:14 in 3D graphics, apps, FPSGame

fps_small.jpg
click to visit gamedev.pl

Initially I didn't want to publish any screenshots from this FPS game, but sometimes you need a little motivation to keep things running, especially when, after a few months, the team still doesn't have the game ready.

This is a pretty old screenshot, from the beginning of October 2009. It's basically dynamic-only lighting (but no deferred lighting at that time) with SSAO. The HUD was added as a separate layer (we were testing a lot of different HUDs at that time). The map was created in Valve Hammer and fine-tuned in Blender.

My simple raytracer is 10x faster than NVidia OptiX!

21.10.2009 05:20 in raytracing, 3D graphics

rt-dabroz.jpg rt-nvidia.jpg

It's frustrating. I coded this sample about a month ago, but the very next day I upgraded my system to 64-bit Windows 7 and my raytracer magically refused to work. I've finally got some time to fix it and post it.

Basically it's the same demo that NVidia is using to promote OptiX. First of all, you can run the OptiX samples on GeForce cards (not only Quadro FX) by disabling the video card name check (very sneaky, NVidia!). So I did. The NVidia sample ran at 3-4 FPS, and it needs a lot of frames to produce the final result (it's similar to progressive JPEG). On the other hand, my very simple raytracer coded in CUDA easily hits VSync (60 FPS) at a higher resolution and with instant multisampling. Sure, my raytracer's output is not pixel-accurate with OptiX, but that's a matter of setting up lights, object positions and other stuff. And why does OptiX perform so badly? I have no idea.

You will probably have a lot of problems running these demos (I have had no luck with release builds). But if you have Visual Studio and the CUDA SDK, you have a chance. The x86 build runs fine on 32-bit Windows XP. The x64 build runs on 64-bit Windows 7 (and probably Vista). Visuals in the x64 version are somewhat different from the image in my post, because I needed to backport some code.

PS. I love the Windows 7 driver model. NVidia drivers (191.07 in my case) are still very buggy (or rather unstable) when running CUDA code, and an automatic driver restart is much better than a total system crash.

PS2. I'll try to find some more time and finally post my OpenGL 3.2 multisample demo source code.

What is explicit_multisample and why do we need it? (or how to do real antialiasing in deferred shading)

16.09.2009 12:08 in 3D graphics, OpenGL

Deferred shading has lately become extremely popular. I’m not a huge fan of it, but depending on the typical scene in a game (preferably indoor, with lots of lights) it can be a great advantage. However, antialiasing is a real pain in the DS case. Most games use an edge filter combined with blur, but the result is visually horrible (especially at low resolutions, where AA is a must). But why can’t we use multisampling (MSAA/CSAA) with deferred shading?

Let’s see how multisample works. Up to now, we:

  • render the scene
  • downsample AA buffer to texture
  • render full-screen quad with texture (and probably some postprocess)

This of course won’t work correctly with deferred shading. Why? Because it will downsample each G-buffer individually. See the following picture.

p1.png

We have 4 pixels with 4 samples each (I won’t go into multisample details, let’s keep it simple), and a normal vector is stored in each sample. We downsample the AA buffer and poof! The normals have gone wrong. Everything else follows the same routine, so at edges we get blurred normals, diffuse values and other data. Using AA this way will probably only add visual artifacts.
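
To put numbers on this, here is a small CPU-side sketch (plain C++, no OpenGL) of what a box-filter resolve does to the normals of one edge pixel. The `Vec3` type, the helper names and the sample values are all made up for illustration; the real hardware resolve may weight samples differently.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical minimal vector type, just for this illustration.
struct Vec3 { float x, y, z; };

float length(const Vec3& v) {
    return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}

// Box-filter resolve: average all samples of one pixel.
Vec3 resolve(const std::vector<Vec3>& samples) {
    assert(!samples.empty());
    Vec3 sum{0.0f, 0.0f, 0.0f};
    for (const Vec3& s : samples) {
        sum.x += s.x;
        sum.y += s.y;
        sum.z += s.z;
    }
    const float inv = 1.0f / static_cast<float>(samples.size());
    return Vec3{sum.x * inv, sum.y * inv, sum.z * inv};
}

// Edge pixel with 4 samples: two hit a wall facing +Z, two hit a wall facing +X.
Vec3 resolved_edge_normal() {
    const std::vector<Vec3> normals = {
        {0.0f, 0.0f, 1.0f}, {0.0f, 0.0f, 1.0f},
        {1.0f, 0.0f, 0.0f}, {1.0f, 0.0f, 0.0f},
    };
    return resolve(normals);
}
```

The resolved normal comes out as (0.5, 0, 0.5): it is no longer unit length and it points in a direction that neither wall actually faces.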

However, OpenGL 3.0 and DirectX 10 have a new feature called explicit multisample (or custom resolve). It gives us access to each individual sample in a multisample buffer. In this scenario we don’t downsample the AA buffer; we use it like a texture, so the lighting shader has access to every normal/diffuse sample, and our computations look like the second picture.

p2.png

And we still benefit from multisampling (instead of supersampling). Time for some C++.
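
Before touching OpenGL, the difference between the two orders of operations can be checked on the CPU. The sketch below is only an illustration: the `Sample` layout, the clamped Lambert `shade` function and the edge-pixel values are assumptions, standing in for whatever the real G-buffer and lighting look like.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Vec3 { float x, y, z; };
struct Sample { Vec3 diffuse; Vec3 normal; };  // hypothetical G-buffer contents

float dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 add(const Vec3& a, const Vec3& b) { return Vec3{a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 scale(const Vec3& v, float s) { return Vec3{v.x * s, v.y * s, v.z * s}; }

// Stand-in for the real lighting: clamped Lambert term, one directional light.
Vec3 shade(const Sample& s, const Vec3& light_dir) {
    return scale(s.diffuse, std::max(0.0f, dot(s.normal, light_dir)));
}

// Classic path: average the G-buffer samples first, then light once.
Vec3 resolve_then_shade(const std::vector<Sample>& px, const Vec3& light_dir) {
    assert(!px.empty());
    Sample avg{Vec3{0.0f, 0.0f, 0.0f}, Vec3{0.0f, 0.0f, 0.0f}};
    for (const Sample& s : px) {
        avg.diffuse = add(avg.diffuse, s.diffuse);
        avg.normal = add(avg.normal, s.normal);
    }
    const float inv = 1.0f / static_cast<float>(px.size());
    avg.diffuse = scale(avg.diffuse, inv);
    avg.normal = scale(avg.normal, inv);
    return shade(avg, light_dir);
}

// Custom resolve: light every sample, then average the lit results.
Vec3 shade_then_resolve(const std::vector<Sample>& px, const Vec3& light_dir) {
    assert(!px.empty());
    Vec3 sum{0.0f, 0.0f, 0.0f};
    for (const Sample& s : px) sum = add(sum, shade(s, light_dir));
    return scale(sum, 1.0f / static_cast<float>(px.size()));
}

// Edge pixel: two samples on a red wall facing the light (+Z),
// two samples on a blue wall facing sideways (+X).
std::vector<Sample> edge_pixel() {
    const Sample red{Vec3{1.0f, 0.0f, 0.0f}, Vec3{0.0f, 0.0f, 1.0f}};
    const Sample blue{Vec3{0.0f, 0.0f, 1.0f}, Vec3{1.0f, 0.0f, 0.0f}};
    return {red, red, blue, blue};
}
```

With the light pointing along +Z, `resolve_then_shade` gives (0.25, 0, 0.25), so the unlit blue wall bleeds into the pixel, while `shade_then_resolve` gives (0.5, 0, 0): half of the fully lit red wall and nothing from the blue one.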

What do we need to do to upgrade our rendering? First, create the buffers:

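// Texture object that will expose the renderbuffer's samples to shaders
glGenTextures(1, &tex);
glBindTexture(GL_TEXTURE_RENDERBUFFER_NV, tex);
// Multisample renderbuffer: 8 samples, RGBA32F, 1024x768
glGenRenderbuffers(1, &buffer);
glBindRenderbuffer(GL_RENDERBUFFER, buffer);
glRenderbufferStorageMultisample(GL_RENDERBUFFER, 8, GL_RGBA32F, 1024, 768);
// Attach the renderbuffer to the texture (NV_explicit_multisample entry point)
glTexRenderbufferNV(GL_TEXTURE_RENDERBUFFER_NV, buffer);
// Render into it as a regular color attachment
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_RENDERBUFFER, buffer);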

Then bind the texture for the full-screen quad (FSQ):

glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_RENDERBUFFER_NV, tex);
glTexRenderbufferNV(GL_TEXTURE_RENDERBUFFER_NV, buffer);
glUniform1i(sampler, 0); // the sampler uniform reads from texture unit 0

Finally, let’s fix the shader code. Assume we have the following code:

#version 150
uniform sampler2D sampler_diffuse, sampler_position, sampler_normal;
in vec2 texcoord; // [0,1]x[0,1]
out vec4 result;
vec4 compute_lighting(vec3 diffuse, vec3 position, vec3 normal)
{
  ...
}
void main()
{
  vec3 diffuse = texture(sampler_diffuse, texcoord).rgb;
  vec3 position = texture(sampler_position, texcoord).xyz;
  vec3 normal = texture(sampler_normal, texcoord).xyz;
  result = compute_lighting(diffuse, position, normal);
} 

We upgrade it to:

#version 150 
#extension GL_EXT_gpu_shader4 : enable
#extension GL_NV_explicit_multisample : enable
uniform samplerRenderbuffer sampler_diffuse, sampler_position, sampler_normal;
in vec2 texcoord; // [0,1]x[0,1]
out vec4 result;
vec4 compute_lighting(vec3 diffuse, vec3 position, vec3 normal)
{
  ...
}
void main()
{
  const int samples = 8;
  result = vec4(0); 
  ivec2 texcoord2 = ivec2(textureSizeRenderbuffer(sampler_diffuse) * texcoord);
  for (int i = 0; i < samples; i++)
  {
    // AA renderbuffers are addressed with integers
    vec3 diffuse = texelFetchRenderbuffer(sampler_diffuse, texcoord2, i).rgb;
    vec3 position = texelFetchRenderbuffer(sampler_position, texcoord2, i).xyz;
    vec3 normal = texelFetchRenderbuffer(sampler_normal, texcoord2, i).xyz;
    result += compute_lighting(diffuse, position, normal);
  }
  result /= float(samples); // GLSL uses constructor-style casts
} 

That’s it! There are various improvements we can make. For example, if we use shadow mapping, we can calculate the shadow term per pixel and then apply it to all samples. And we must hope that ATI will implement OpenGL 3.2 (and explicit multisample) soon.
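
As a rough illustration of the shadow-term idea, here is a CPU-side sketch in C++. Everything in it is hypothetical: the `ShadowMap` stub returns a fixed visibility and only counts lookups, and the lighting is reduced to scalar intensities. It also assumes that one shadow lookup per pixel is an acceptable approximation for all of that pixel's samples.

```cpp
#include <cassert>
#include <vector>

// Scalar stand-ins for G-buffer data, to keep the sketch short.
struct Sample { float diffuse; float n_dot_l; };

// Hypothetical shadow map stub: constant visibility, counts its lookups.
struct ShadowMap {
    int lookups;
    float visibility;
    explicit ShadowMap(float v) : lookups(0), visibility(v) {}
    float lookup() { ++lookups; return visibility; }
};

// Light every sample, but fetch the (expensive) shadow term only once per pixel.
float shade_pixel(const std::vector<Sample>& samples, ShadowMap& shadow) {
    assert(!samples.empty());
    const float shadow_term = shadow.lookup();  // once per pixel, not per sample
    float sum = 0.0f;
    for (const Sample& s : samples) {
        sum += s.diffuse * s.n_dot_l * shadow_term;
    }
    return sum / static_cast<float>(samples.size());
}
```

In the shader this would mean moving the shadow lookup out of the sample loop, so its cost stays the same no matter how many samples the buffer has.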

Update: there is ARB_texture_multisample (now part of the OpenGL core) that should do the same thing and be more portable. I'm going to check the differences between it and NV_explicit_multisample soon!