tinySceneGraph



Partially Resident Textures - Reloaded

Introduction

[Figure: kidney volume rendering]

AMD was the first to make sparse textures available via the vendor-specific OpenGL extension AMD_sparse_texture, back in 2012. tinyScenegraph adopted this extension to support 2D mega-textures via special scenegraph nodes, allowing virtual 2D textures with resolutions up to 512k x 512k. Find out more about csgSparseTexture here.

Meanwhile, the OpenGL Architecture Review Board (ARB) adopted AMD's work and created a multi-vendor version, GL_ARB_sparse_texture, which is also supported on recent nVidia hardware/drivers. However, while the AMD extension covered everything from the host API calls to texel fetches in a GLSL shader, the ARB version omits any GLSL specifics and focuses on the host API only. Sparse textures may be sampled with regular GLSL functions, but these fail silently if the accessed texels are not backed by physical memory. To detect uncommitted texels with the ARB version in GLSL, the EXT_sparse_texture2 extension is required as well. Unfortunately, this extension is hard to find as of May 2015.

Recently, the csgTexture3D node learned a more generic API to support vast data. It builds on top of the ARB extension to support nVidia hardware as well. Let's take a look at the ARB extension and how it differs from its AMD predecessor:

Both extensions work with memory pages, allocating blocks of texels. The specific page size (and thus the block dimension) depends on the texture format and target. These sizes can be queried like this:
ARB:

std::vector<GLint> pageSizesX;
GLint              numPageLayoutsAvail = 0;

// Query how many page layouts exist for this target/format.
// Note: bufSize counts GLint values, not bytes.
glGetInternalformativ(GL_TEXTURE_3D, GL_RGBA8, 
                      GL_NUM_VIRTUAL_PAGE_SIZES_ARB, 
                      1, &numPageLayoutsAvail);

// Fetch the page width of every available layout:
pageSizesX.resize(numPageLayoutsAvail);
glGetInternalformativ(GL_TEXTURE_3D, GL_RGBA8, 
                      GL_VIRTUAL_PAGE_SIZE_X_ARB, 
                      numPageLayoutsAvail, 
                      pageSizesX.data());
// same for y/z, respectively
...

// Now select the layout 0..numPageLayoutsAvail-1 
// you want to use:
glTexParameteri (GL_TEXTURE_3D, 
                 GL_VIRTUAL_PAGE_SIZE_INDEX_ARB, 0);

AMD:

// Only one page layout per target/format here:
GLint pageSizeX, pageSizeY, pageSizeZ;
glGetInternalformativ(GL_TEXTURE_2D, GL_R8, 
                      GL_VIRTUAL_PAGE_SIZE_X_AMD, 
                      1, &pageSizeX);
glGetInternalformativ(GL_TEXTURE_2D, GL_R8, 
                      GL_VIRTUAL_PAGE_SIZE_Y_AMD, 
                      1, &pageSizeY);
glGetInternalformativ(GL_TEXTURE_2D, GL_R8, 
                      GL_VIRTUAL_PAGE_SIZE_Z_AMD, 
                      1, &pageSizeZ);
 
Note that the ARB extension can offer several page layouts that trade off width, height and depth per block differently. Thus, the page size query may return more than one size per dimension. In practice, I have not yet seen more than one layout.
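
If a driver ever does report several layouts, some selection rule is needed. The following is only a sketch of one possible rule for volume data: prefer the layout whose block is closest to a cube. The helper name pickMostCubicLayout and its parameters are made up for this example and are not part of OpenGL or tinyScenegraph; the three vectors are assumed to hold the values returned by the ARB page size queries above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an OpenGL or tinyScenegraph function).
// sx/sy/sz hold the page sizes per layout, as returned by the
// GL_VIRTUAL_PAGE_SIZE_{X,Y,Z}_ARB queries. Returns the index of the
// layout with the smallest ratio between longest and shortest edge.
std::size_t pickMostCubicLayout(const std::vector<int>& sx,
                                const std::vector<int>& sy,
                                const std::vector<int>& sz)
{
    assert(!sx.empty() && sx.size() == sy.size() && sx.size() == sz.size());

    std::size_t best = 0;
    double bestRatio = 0.0;
    for (std::size_t i = 0; i < sx.size(); ++i) {
        const int shortest = std::min({sx[i], sy[i], sz[i]});
        const int longest  = std::max({sx[i], sy[i], sz[i]});
        assert(shortest > 0);
        const double ratio = double(longest) / double(shortest);
        if (i == 0 || ratio < bestRatio) {
            best = i;
            bestRatio = ratio;
        }
    }
    return best;
}
```

The returned index would then be passed as the value of GL_VIRTUAL_PAGE_SIZE_INDEX_ARB. For 2D array textures a flat layout may be the better choice, so treat the rule as an assumption to tune, not as a recommendation.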

Once the page sizes are known, the overall texture size can be specified and memory can be allocated/committed for blocks of texels. Both offset and size of the blocks to be (de)committed have to be multiples of the block dimensions. The API again differs slightly between the AMD and the ARB version:
ARB:

// Allocate a sparse 3D texture:
glBindTexture(GL_TEXTURE_3D, glID);

// Set sparse texture flag and page layout 
// (do *before* glTexStorage3D()):
glTexParameteri (GL_TEXTURE_3D, 
                 GL_VIRTUAL_PAGE_SIZE_INDEX_ARB, 0);
glTexParameteri (GL_TEXTURE_3D, 
                 GL_TEXTURE_SPARSE_ARB, GL_TRUE);
// GL_TEXTURE_IMMUTABLE_FORMAT is read-only state and must not
// be set here; glTexStorage3D() makes the format immutable.

// Specify overall storage requirements (but don't 
// commit physical memory store, yet). glTexStorage3D()
// needs a sized internal format such as GL_RGBA8:
glTexStorage3D (GL_TEXTURE_3D, numMipLevels, GL_RGBA8, 
               texWidth, texHeight, texDepth);

// Commit (=allocate physical memory) a page:
glTexPageCommitmentARB(GL_TEXTURE_3D, mipLevel, 
                       offsetX,offsetY,offsetZ, 
                       width, height, depth, GL_TRUE);

// Submit data to committed tiles:
glTexSubImage3D (GL_TEXTURE_3D, mipLevel, 
                 offsetX,offsetY,offsetZ, 
                 tileWidth, tileHeight,tileDepth, 
                 GL_RGBA, GL_UNSIGNED_BYTE, data);

// Deallocate tile again:
glTexPageCommitmentARB(GL_TEXTURE_3D, mipLevel, 
                       offsetX,offsetY,offsetZ, 
                       width, height, depth, GL_FALSE);

AMD:

// Allocate a sparse 2D array texture of size 
// layerWidth x layerHeight, containing numLayers images:
// Specify overall storage requirements (but don't 
// commit physical memory store, yet):
glBindTexture (GL_TEXTURE_2D_ARRAY, glID);
glTexStorageSparseAMD(GL_TEXTURE_2D_ARRAY,GL_RGBA, 
                     layerWidth, layerHeight, GLsizei(1), 
                     numLayers, 
                     GL_TEXTURE_STORAGE_SPARSE_BIT_AMD); 

// Commit (allocate) two new tiles (=pages):          
glTexSubImage3D (GL_TEXTURE_2D_ARRAY, mipLevel, 
                 offsetX+1,offsetY,layerID, 
                 layerWidth, layerHeight,1, 
                 GL_RGBA, GL_UNSIGNED_BYTE, data);
glTexSubImage3D (GL_TEXTURE_2D_ARRAY, mipLevel, 
                 offsetX+2,offsetY,layerID, 
                 layerWidth, layerHeight,1, 
                 GL_RGBA, GL_UNSIGNED_BYTE, data);
// Deallocate a tile by passing NULL as the data pointer:
glTexSubImage3D (GL_TEXTURE_2D_ARRAY, mipLevel, 
                 offsetX, offsetY, layerID, 
                 layerWidth, layerHeight, 1,
                 GL_RGBA, GL_UNSIGNED_BYTE, NULL);
 
In the ARB version, the commit parameter toggles between block allocation and deallocation; a block is identified by the offset/size parameters. Committed memory can receive data via glTexSubImage*D(). Note that not all texture formats support sparse allocation, and that the offset* parameters are required to be multiples of VIRTUAL_PAGE_SIZE_*_ARB.
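
Since an application usually thinks in arbitrary texel regions rather than in pages, the region has to be widened to page boundaries before it is committed. Here is a minimal sketch of that rounding for one axis. PageRange and alignToPages are names invented for this example. The clamp at the texture border assumes that a region may end at the texture edge even if that edge is not a page multiple, which is how I read the ARB specification; please verify this against the spec for your use case.

```cpp
#include <cassert>

// Hypothetical helper types, not part of OpenGL or tinyScenegraph.
struct PageRange {
    int offset;  // page-aligned start
    int size;    // page-aligned size (or clamped to the texture edge)
};

// Widen [offset, offset + size) along one axis to page boundaries.
// 'extent' is the texture size along that axis at the given mip level.
PageRange alignToPages(int offset, int size, int pageSize, int extent)
{
    assert(pageSize > 0 && offset >= 0 && size > 0);
    assert(offset + size <= extent);

    // Round the start down and the end up to the next page boundary.
    const int first = (offset / pageSize) * pageSize;
    int last = ((offset + size + pageSize - 1) / pageSize) * pageSize;

    // Assumption: the last block may be cut off by the texture edge.
    if (last > extent) {
        last = extent;
    }
    return { first, last - first };
}
```

Calling this once per axis with the matching page size yields the offset and size triples for glTexPageCommitmentARB(). For example, a request for texels 40..89 with a page size of 32 is widened to offset 32 and size 64.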

Sparse Texture Fetches in a GLSL Shader

To fully utilise sparse textures, a custom shader needs to be active. The fixed-function pipeline produces undefined results if a texel is accessed that is located in an uncommitted block, including reads required for filtering, mipmap generation, etc. AMD's extension defines GLSL access/test functions, whereas the ARB version leaves the behaviour undefined and relies on another extension (EXT_sparse_texture2) to provide the GLSL functionality.

Applications

[Figure: mummy volume rendering]

Imagine you had a volume of voxels, 8192x8192x2048, populated with RGBA values. This volume sums up to 512GB of voxel data. Even if you use just one byte per voxel, the remaining 128GB are still too much for most graphics cards. With sparse textures, you can commit and upload slices of the volume, and handle the rendering sequentially.
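
The numbers above, and the question of how many slices can be resident at once, can be checked with a little arithmetic. The sketch below is not taken from tinyScenegraph; volumeBytes, slicesPerSlab and the memory budget are assumptions made for this example. It rounds the slab thickness down to a multiple of the page depth so that each slab can be committed and released as whole pages.

```cpp
#include <cassert>
#include <cstdint>

// Size in bytes of a dense w x h x d volume.
std::uint64_t volumeBytes(std::uint64_t w, std::uint64_t h,
                          std::uint64_t d, std::uint64_t bytesPerVoxel)
{
    return w * h * d * bytesPerVoxel;
}

// Hypothetical planning helper: number of z-slices per slab so that one
// slab fits into 'budgetBytes', rounded down to a multiple of the page
// depth. Returns 0 if not even one page-deep slab fits into the budget.
std::uint64_t slicesPerSlab(std::uint64_t w, std::uint64_t h,
                            std::uint64_t bytesPerVoxel,
                            std::uint64_t budgetBytes,
                            std::uint64_t pageDepth)
{
    assert(w > 0 && h > 0 && bytesPerVoxel > 0 && pageDepth > 0);
    const std::uint64_t sliceBytes = w * h * bytesPerVoxel;
    const std::uint64_t fitting    = budgetBytes / sliceBytes;
    return (fitting / pageDepth) * pageDepth;
}
```

With one byte per voxel, a single 8192x8192 slice takes 64MB, so a budget of 2GB holds 32 slices at a time. With RGBA voxels a slice takes 256MB, and the same budget holds only 8 slices, which is less than an assumed page depth of 16.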

Array textures tend to show the same explosion of data as volume textures, so having them only partially resident in graphics memory is equally useful. If you write a game using OpenGL, you may consider giving all your textures the same size and storing them in a 2D texture array. This avoids binding textures all the time: just bind the array once and access individual textures via the z-coordinate. Sparse textures let you commit individual members of the array and release others that are currently not needed, to save memory.
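
Deciding which array members to keep resident needs some bookkeeping on the host side. As a rough sketch, here is a least-recently-used tracker for layers. The class LayerResidency is hypothetical and not part of tinyScenegraph. It does not call OpenGL itself; instead it invokes a caller-supplied callback, which in a real application would be the place to call glTexPageCommitmentARB() for the whole layer.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <unordered_map>
#include <utility>

// Hypothetical LRU bookkeeping for the layers of a sparse array texture.
class LayerResidency {
public:
    // commit(layer, true) is expected to commit the layer's pages,
    // commit(layer, false) to release them again.
    LayerResidency(std::size_t maxResident,
                   std::function<void(int, bool)> commit)
        : maxResident_(maxResident), commit_(std::move(commit))
    {
        assert(maxResident_ > 0);
    }

    // Mark a layer as needed for rendering. Commits it if necessary and
    // evicts the least recently used layer when the budget is exhausted.
    void touch(int layer)
    {
        auto it = pos_.find(layer);
        if (it != pos_.end()) {
            // Already resident: move it to the front of the LRU list.
            lru_.splice(lru_.begin(), lru_, it->second);
            return;
        }
        if (lru_.size() == maxResident_) {
            const int victim = lru_.back();
            lru_.pop_back();
            pos_.erase(victim);
            commit_(victim, false);
        }
        lru_.push_front(layer);
        pos_[layer] = lru_.begin();
        commit_(layer, true);
    }

    bool isResident(int layer) const { return pos_.count(layer) != 0; }

private:
    std::size_t maxResident_;
    std::function<void(int, bool)> commit_;
    std::list<int> lru_;  // front = most recently used layer
    std::unordered_map<int, std::list<int>::iterator> pos_;
};
```

Calling touch() for every layer that a frame samples keeps the working set resident. Uploading the texel data after a fresh commit is left to the callback, and a real implementation would probably also want to avoid evicting layers that are still in use by the current frame.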

Finally, you don't even have to store images in textures. Using (compute) shaders, textures can encode arbitrary data to be processed on the GPU. In fact, tinyScenegraph's main motivation for sparse texture support was collision detection on the GPU. The tinyParticles plugin animates meshes on the GPU and uses a sparse 3D texture to look up spatial relationships between objects for collision detection - but that is a different story, to be told another day...

Keep rendering,
Christian



Copyright by Christian Marten, 2015
Last change: 28.08.2015