In this post, I show how to get the z of a pixel using the OpenGL Z-Buffer. I use it to identify the tile below the mouse cursor. This approach is faster than ray casting, as it lets the GPU do the job!
This post is part of the OpenGL 2D Facade series
To check that it works, the player clicks on items in the world, and the character says what each one is.
The usual approach is to cast a ray from the pixel and find the closest intersecting face. In 2D, this means looking for all the faces that contain the pixel. Since our faces are rectangles, the intersection test is simple, and on regular layers such as grids it is even easier. Once we have found the faces that contain the pixel, we read the tile texture to check whether the pixel is transparent, in which case we ignore the face. In the end, we select the face with the lowest depth value.
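To make the comparison concrete, here is a minimal sketch of that CPU-side search. The `Face` tuple and the `pickFaceOnCpu()` function are hypothetical names that are not part of the facade, and the transparency test on the tile texture is left out.

```python
from typing import List, NamedTuple, Optional


class Face(NamedTuple):
    """Hypothetical axis-aligned rectangle in pixels, with a depth (lower = closer)."""
    left: int
    top: int
    width: int
    height: int
    depth: int


def pickFaceOnCpu(faces: List[Face], x: int, y: int) -> Optional[Face]:
    # Keep only the rectangles that contain the pixel
    hits = [
        f for f in faces
        if f.left <= x < f.left + f.width and f.top <= y < f.top + f.height
    ]
    # The transparency test on the tile texture would go here (omitted)
    # Among the remaining faces, the lowest depth is the closest one
    return min(hits, key=lambda f: f.depth, default=None)


faces = [
    Face(0, 0, 32, 32, depth=100),   # ground tile
    Face(0, -32, 32, 64, depth=40),  # tall item covering the same tile
    Face(64, 0, 32, 32, depth=100),  # another ground tile
]
picked = pickFaceOnCpu(faces, 10, 10)
```

Every click walks through all the faces, and a complete version would also need a texture lookup per candidate. This is the work the Z-Buffer approach avoids.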
As you can imagine, ray casting requires many computations. With the approach based on the Z-Buffer, we can reduce that to almost nothing and save CPU time for other tasks.
We can ask OpenGL for any value of the Z-Buffer. For instance, we can read the Z-Buffer value of a pixel (x,y):
    data = glReadPixels(x, screenHeight - 1 - y, 1, 1, GL_DEPTH_COMPONENT, GL_FLOAT)
    zbuffer = float(data)
Remember that the Y-axis of OpenGL points upward, which is why we invert y.
This zbuffer value is in [0,1], so we need to convert it to NDC (Normalized Device Coordinates):
z = 2 * zbuffer - 1
Finally, we "linearize" this z value to get the depth of the pixel, as shown in the previous post:
    zNear = 0.001
    zFar = 1.0
    maxDepth = 65536
    a = maxDepth * zFar / (zFar - zNear)
    b = maxDepth * zFar * zNear / (zNear - zFar)
    depth = a + b / z
With these settings, the depth value is between 0 (front) and 65535 (background).
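As a quick sanity check, the conversion can be run in both directions without any OpenGL context. This is only a sketch: `depthToZBuffer()` and `zBufferToDepth()` are hypothetical helper names, and the first one assumes that the buffer stores (z + 1) / 2, which is simply the inverse of the NDC formula above.

```python
zNear = 0.001
zFar = 1.0
maxDepth = 65536
a = maxDepth * zFar / (zFar - zNear)
b = maxDepth * zFar * zNear / (zNear - zFar)


def depthToZBuffer(depth: float) -> float:
    # Inverse of the two conversions: depth -> z -> value in [0,1]
    z = b / (depth - a)
    return (z + 1) / 2


def zBufferToDepth(zbuffer: float) -> float:
    # Same steps as in the text: [0,1] value -> z -> depth
    z = 2 * zbuffer - 1
    return a + b / z


depths = (0, 1, 1000, 65535)
roundTrip = [round(zBufferToDepth(depthToZBuffer(d))) for d in depths]
```

With double-precision floats, rounding the result gives back the depth we started from. Values read from the GPU have less precision, which is why the facade methods below round the depth before using it.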
We extend the ZBuffer class with these formulae:
    class ZBuffer:
        zNear = 0.001
        zFar = 1.0
        maxDepth = 65536
        a = maxDepth * zFar / (zFar - zNear)
        b = maxDepth * zFar * zNear / (zNear - zFar)

        @staticmethod
        def depth2z(depth: float) -> float:
            return ZBuffer.b / (depth - ZBuffer.a)

        @staticmethod
        def z2depth(z: float) -> float:
            return ZBuffer.a + ZBuffer.b / z

        @staticmethod
        def zbuffer2z(zbuffer: float) -> float:
            return 2 * zbuffer - 1

        @staticmethod
        def zbuffer2depth(zbuffer: float) -> float:
            return ZBuffer.z2depth(2 * zbuffer - 1)
We also add a new method in the OpenGL facade that returns the depth of a pixel (x,y):
    def getPixelDepth(self, x: int, y: int) -> float:
        data = glReadPixels(x, self.screenHeight - 1 - y, 1, 1, GL_DEPTH_COMPONENT, GL_FLOAT)
        zbuffer = float(data)
        depth = ZBuffer.zbuffer2depth(zbuffer)
        return depth
Since we assign a range of depth values to each layer, we can find the layer of a pixel. It is implemented in the getPixelLayer() method of the facade:
    def getPixelLayer(self, x: int, y: int) -> Tuple[Union[None, LayerGroup], int, Union[None, Layer], int]:
        depth = int(round(self.getPixelDepth(x, y)))
        for layerGroupIndex, layerGroup in enumerate(self.__layerGroups):
            if layerGroup is None:
                continue
            for layerIndex, layer in enumerate(layerGroup):
                if layer is None:
                    continue
                if layer.hasDepth(depth):
                    return layerGroup, layerGroupIndex, layer, layerIndex
        return None, -1, None, -1
The search relies on the hasDepth() method of facade layers, which returns True if the layer uses the depth value and False otherwise. The implementation of these methods depends on each case and is straightforward.
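For a layer that owns a contiguous range of depth values, hasDepth() could look like the sketch below. The `RangeLayer` class, the `findLayer()` function, and the depth ranges are hypothetical and only illustrate the idea; the real facade layers may store their depths differently.

```python
from typing import List, Optional, Tuple


class RangeLayer:
    """Hypothetical layer that owns the depth values in [minDepth, maxDepth]."""

    def __init__(self, name: str, minDepth: int, maxDepth: int):
        self.name = name
        self.minDepth = minDepth
        self.maxDepth = maxDepth

    def hasDepth(self, depth: int) -> bool:
        return self.minDepth <= depth <= self.maxDepth


def findLayer(layers: List[Optional[RangeLayer]], depth: int) -> Tuple[Optional[RangeLayer], int]:
    # Same loop as in the facade, reduced to a single list of layers
    for index, layer in enumerate(layers):
        if layer is None:
            continue
        if layer.hasDepth(depth):
            return layer, index
    return None, -1


layers = [RangeLayer("ground", 60000, 60999), None, RangeLayer("characters", 1000, 1999)]
layer, index = findLayer(layers, 1500)
```

Since the depth ranges of the layers do not overlap, the first layer that accepts the depth is the only candidate.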
Finding the face of a pixel depends on the type of the layer. In the case of a grid, we want the cell coordinates of the face. We add a new method getPixelCell() to the grid layer class:
    def getPixelCell(self, x: int, y: int) -> (int, int):
        depth = int(round(self._gui.getPixelDepth(x, y)))
        viewX, viewY = self._layerGroup.getTranslation()
        cellX = (x + viewX) // self.tileWidth
        for cellY, rowDepth in enumerate(self.__depths):
            if rowDepth == depth:
                return cellX, cellY
        return -1, -1
Line 2 gets the depth of the pixel. We need it to find the right cell.
Line 3 gets the current shift of the layer. The coordinates of the pixel are relative to the screen or window; we need to translate them to world coordinates.
Line 4 translates the x screen/window coordinate to cell world coordinate. Note that we can't do the same with y coordinates because there are items larger than a row. For instance, big trees are two tiles tall.
Lines 5-7 parse all depths used by the layer and return the cell y coordinate corresponding to the pixel's depth.
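The same logic can be tried outside the facade with plain values. In this sketch, `pixelToCell()` is a hypothetical free function, and the row depths, view shift, and tile width are made-up numbers.

```python
from typing import List, Tuple


def pixelToCell(x: int, depth: int, viewX: int, tileWidth: int,
                rowDepths: List[int]) -> Tuple[int, int]:
    # The column comes from the x coordinate, shifted to world coordinates
    cellX = (x + viewX) // tileWidth
    # The row comes from the depth, since a tall tile can cover pixels of other rows
    for cellY, rowDepth in enumerate(rowDepths):
        if rowDepth == depth:
            return cellX, cellY
    return -1, -1


# One depth per row (illustrative values)
rowDepths = [5003, 5002, 5001, 5000]
cell = pixelToCell(x=70, depth=5001, viewX=100, tileWidth=32, rowDepths=rowDepths)
```

Here the pixel at x = 70 with a view shift of 100 falls in column 170 // 32 = 5, and the depth 5001 belongs to row 2, so the result is the cell (5, 2).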
In the case of a characters layer, we want all the characters at some pixel location. We add a new method getPixelCharacterIndices() to the characters layer class:
    def getPixelCharacterIndices(self, x: int, y: int) -> List[int]:
        depth = int(round(self._gui.getPixelDepth(x, y)))
        if not self.hasDepth(depth):
            return []
        viewX, viewY = self._layerGroup.getTranslation()
        return self.findFaces(x + viewX, y + viewY)
Lines 2-4 check that there is a character at screen/window coordinates (x, y). It can't be faster!
Line 5 gets the current shift of the layer to convert screen/window coordinates to world coordinates.
Line 6 uses a new method findFaces() of the OpenGLLayer class. It uses NumPy to find the faces that contain a given point, which is faster than pure Python code:
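    def findFaces(self, x: float, y: float) -> List[int]:
        # Convert pixel coordinates to OpenGL screen coordinates in [-1, 1]
        spriteScreenX = -1 + x * self.__mesh.screenPixelWidth
        spriteScreenY = 1 - y * self.__mesh.screenPixelHeight
        x1 = self.__vertices[:, 1, 0]
        y1 = self.__vertices[:, 1, 1]
        x2 = self.__vertices[:, 3, 0]
        y2 = self.__vertices[:, 3, 1]
        # Chained comparisons and "and" do not work element-wise on NumPy arrays
        mask = (x1 <= spriteScreenX) & (spriteScreenX <= x2) & (y2 <= spriteScreenY) & (spriteScreenY <= y1)
        # nonzero() returns a tuple with one index array per dimension
        return mask.nonzero()[0].tolist()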
We assume that we won't get a lot of characters simultaneously (e.g., fewer than a thousand), so this procedure should always run fast.
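Here is a standalone sketch of the same vectorized test that runs without the facade. It rests on several assumptions: one pixel is 2 / screenWidth wide in OpenGL screen coordinates, vertex 1 of each face is its top-left corner, and vertex 3 is its bottom-right corner. The `quad()` helper and the sample faces are hypothetical.

```python
from typing import List

import numpy as np


def findFaces(vertices: np.ndarray, x: float, y: float,
              screenWidth: int, screenHeight: int) -> List[int]:
    # Pixel -> OpenGL screen coordinates in [-1, 1] (assumed pixel size)
    screenX = -1 + x * (2 / screenWidth)
    screenY = 1 - y * (2 / screenHeight)
    x1, y1 = vertices[:, 1, 0], vertices[:, 1, 1]  # assumed top-left corner
    x2, y2 = vertices[:, 3, 0], vertices[:, 3, 1]  # assumed bottom-right corner
    # Element-wise test on all faces at once
    mask = (x1 <= screenX) & (screenX <= x2) & (y2 <= screenY) & (screenY <= y1)
    return np.flatnonzero(mask).tolist()


def quad(left: float, top: float, right: float, bottom: float) -> list:
    # Four corners: bottom-left, top-left, top-right, bottom-right
    return [[left, bottom], [left, top], [right, top], [right, bottom]]


vertices = np.array([
    quad(-1.0, 1.0, 0.0, 0.0),   # face 0: top-left quarter of the screen
    quad(0.0, 0.0, 1.0, -1.0),   # face 1: bottom-right quarter
    quad(-0.5, 0.5, 0.5, -0.5),  # face 2: centered, overlaps both
])
hits = findFaces(vertices, x=300, y=200, screenWidth=800, screenHeight=600)
```

The pixel (300, 200) lies in the top-left quarter and inside the centered face, so the result contains the indices 0 and 2. The cost is a handful of array comparisons, whatever the number of faces.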
I improved the text layers so they can display several texts. I also updated characters so they can have text on top of their head. I based these implementations on dynamic meshes, using a design I am not happy with. I'll present a better solution in the next post.
In the next post, I'll show how to create dynamic meshes.