Answers to Review Questions - Game Programming Algorithms and Techniques: A Platform-Agnostic Approach (2014)

Game Programming Algorithms and Techniques: A Platform-Agnostic Approach (2014)

Appendix A. Answers to Review Questions

This appendix presents you with the answers to the review questions at the end of Chapters 112.

Chapter 1: Game Programming Overview

1. The early consoles were programmed in assembly language because they had an extraordinarily small amount of memory and processing power. A high-level language would have added too much overhead, especially because compilers at the time were not nearly as optimized as they are today. Furthermore, the early consoles typically did not have development tools that would have been able to facilitate high-level language usage, even if it were feasible.

2. Middleware is a code library written by a third party to solve a specific problem in games. For example, Havok is a physics library that many game companies use in lieu of implementing their own physics systems.

3. There are many possible answers to this question. Here is one:

For Galaga, there are two inputs to process: the joystick, which allows the player’s ship to move left or right, and the fire button. The majority of the work in the “update world” phase of the game loop would be to spawn and then simulate the AI for the enemy ship squads that come in. Furthermore, the game needs to detect collisions between the enemy projectiles and the player, as well as detect the player projectiles against the enemy. The game needs to keep track of how many waves have gone by and increment levels as appropriate. The only outputs for this game are video and audio.

4. Other outputs in a traditional game loop might include sound, force feedback, and networking data.

5. If the rendering portion takes roughly 30ms, and updating the world takes 20ms, a traditional game loop would take 50ms per frame. However, if the rendering is split into its own separate thread, it can then be completed in parallel with the world updates. In this case, the overall time per frame should be reduced to 30ms.

6. Input lag is the time delay between pressing a button and seeing the result of that button press onscreen. The multithreaded game loop increases input lag because the rendering must be one frame behind the gameplay, which adds one additional frame of lag.

7. Real time is the amount of time that has elapsed in the real world, whereas game time measures time elapsed in the game world. There are many instances where game time might diverge—for instance “bullet time” slow-mo would have game time be slower than real time.

8. Here is the code as a function of delta time:

position.x += 90 * deltaTime
position.y += 210 * deltaTime

9. To force a 30 FPS frame rate, at the end of each iteration of the loop, check how much time has elapsed for the iteration. If it’s less than 33.3ms, then the code should wait until 33.3ms has elapsed before continuing to the next iteration. If the elapsed time is greater than 33.3ms, then one option is to try to skip rendering on the next frame, in an attempt to catch up.

10. The three primary categories are objects that are drawn and updated, objects that are only updated, and objects that are only drawn. Objects that are drawn and updated covers most objects in the game, including the player, characters, enemies, and so on. Objects that are only updated include the camera and invisible design triggers. Objects that are only drawn include static meshes, such as trees in the background.

Chapter 2: 2D Graphics

1. The electron gun starts aiming at the top-left corner of the screen and draws across one scan line. It then aims down one line and starts at the left again, repeating this process until all the scan lines have been drawn. VBLANK is the amount of time it takes for the electron gun to reposition its aim from the bottom-right corner all the way back to the top-left corner.

2. Screen tearing occurs when the graphical data changes while the electron gun is drawing one frame. The best way to avoid screen tearing is to use double buffering, and to only swap buffers during VBLANK.

3. If a single-color buffer is used, it means that in order to prevent screen tearing, all rendering must be completed during the relatively short VBLANK period. But with double-buffering, the only work that must be completed during VBLANK is the swapping of buffers, and there is much more time to render the scene.

4. The painter’s algorithm is a rendering technique where the scene is rendered back to front. This is similar to the way a painter might paint a canvas. The painter’s algorithm works relatively well for 2D scenes.

5. Sprite sheets can net huge memory savings because it is possible to pack in sprites more tightly than it might be with individual sprite files. There also can be performance gains from the fact that with sprite sheets, source textures do not have to be changed as frequently in the hardware.

6. In single-direction scrolling, the camera’s position should update when the player has advanced past the midpoint of the screen.

7. A doubly linked list would work best, because it would be possible to point to both the previous background segment and the next background segment.

8. A tile set contains all the possible images that can be used for the tiles. Each tile in the tile set has a corresponding ID. The tile map, on the other hand, lists out the IDs of the tile layout of a particular scene or level.

9. By storing the animation FPS as a member variable, it makes it possible to alter the speed of an animation as it is playing. For example, a run animation could start playing more rapidly as the character moves more quickly.

10. An isometric view displays the world from a slight angle, so there is a greater sense of depth.

Chapter 3: Linear Algebra for Games

1. Solutions:

a. Image5, 9, 13Image

b. Image4, 8, 12Image

c. Image

2. If you only care about the direction of a vector, it is probably worthwhile to normalize this vector. If you need both the direction and magnitude of the vector, you should not normalize.

3. ||AP||2 <||BP||2

4. Project Image onto Image to get Image, which extends all the way to the start of Image.

The vector Image goes from Image to Image, so Image.

Substitute for Image to get the final solution: Image.

5. Solution:

Image

Normalize Image and Image.

Image

Solve for θ using the dot product.

Image

6. This is Image×ImageImage0.23, – 0.12, – 0.47Image.

You should normalize this, for a final value of Image0.43, – 0.23, – 0.86Image.

7. This will return the normal that faces in the opposite direction.

8. Take the player’s forward vector and cross it with the vector from the player to the sound effect. If the z value of the cross is positive, the sound should come out of the left speaker. Otherwise, it should come out of the right.

9. Result:

Image

10. When the matrix is orthonormal.

Chapter 4: 3D Graphics

1. Triangles are the simplest type of polygon and can be represented with only three vertices. Furthermore, they are guaranteed to lie on a single plane.

2. The four primary coordinate spaces in the rendering pipeline are model space, world space, view/camera space, and projection space. Model space is the coordinate space relative to the origin point of the model (which is often set in a modeling program). World space is the coordinate system of the level, and all objects in the world are positioned relative to the world space origin. View space is the world seen through the eyes of the camera. Projection space is the coordinate space where the 3D view has been flattened into a 2D image.

3. The order of multiplication should be rotation matrix multiplied by the translation matrix:

Image

4. An orthographic projection doesn’t have depth perception, which means objects further away from the camera are not smaller than objects closer to the camera. On the other hand, a perspective projection does have true 3D depth perception.

5. Ambient and directional lights both globally affect the scene. However, ambient light is uniformly applied to the entire scene. On the other hand, directional lights come from one specific direction, and because of this they will not illuminate objects on all sides.

6. The three components of the Phong reflection model are ambient, diffuse, and specular. The ambient component uniformly applies to the entire object. The diffuse component depends on the position of lights in relation to the surface, and is the primary reflection of light. The specular components are the shiny highlights on the surface, and depend on both the light position and the position of the camera. Phong reflection is considered a local lighting model because each object is lit as if it were the only object in the scene.

7. Gouraud shading is per-vertex, whereas Phong shading is per-pixel. Phong shading looks much better on low-poly models because of this, but it also ends up being more computationally expensive.

8. The biggest issue with using the painter’s algorithm in a 3D scene is that the process of sorting the scene back to front is time consuming, especially when given the fact that the camera positions can change dramatically. It also can potentially cause too much overdraw, which is when a pixel is drawn over during the process of rendering a frame. Finally, overlapping triangles may have to be broken down into smaller triangles for the painter’s algorithm. Z-buffering solves these problems by doing a depth check at each pixel prior to drawing it—if there is a pixel already drawn with a closer depth than the current pixel, the current pixel does not get drawn. This allows the scene to be drawn in an arbitrary order.

9. Euler rotations only work on the coordinate axes, whereas quaternions can rotate about any arbitrary axis. Euler angles can also suffer from gimbal lock, and they are difficult to interpolate between.

10. Solve for the vector and scalar components:

Image

Chapter 5: Input

1. A digital device is one that is binary—it can be either in an “on” state or an “off” state. One example of a digital device is a key on the keyboard. An analog device, on the other hand, is one that has a range of values. An example of an analog device is a joystick.

2. Chords involve pressing multiple buttons at the same time in order to get a particular action to occur, whereas sequences are a series of button presses.

3. Without “just pressed” and “just released” events, button presses will be detected over the course of multiple frames. For example, if you press the spacebar quickly, at 60 FPS it is likely the spacebar will be detected as pressed for several frames, which in turn will trigger the spacebar’s action for multiple consecutive frames. On the other hand, if only “just pressed” is used, only the first frame where the spacebar is pressed will trigger the action.

4. Analog filtering attempts to solve the problem of spurious, low-value inputs. For example, with a joystick it is rare to get precisely (0,0) when the joystick is at rest. Instead, a range of values around zero will be detected. Filtering will make sure that the range of values around zero are considered equivalent to zero input.

5. With a polling system, there is a possibility that different points in the code will get different results if a key press changes in the middle of the frame. Furthermore, a polling system may unnecessarily query the same button multiple times in multiple places. This can lead to code duplication. An event system, on the other hand, tries to centralize all the input so that interested parties can register with the centralized system. This also guarantees that a particular key will have only one value on one frame.

6. As long as the programming language supports storing a function (or a pointer to a function) in a variable, it is possible to simply store a list of all functions registered to a particular event.

7. One method that can ensure the UI has the first opportunity to process events is to first pass the container of triggered events to the UI. The UI can then respond to any events it chooses to, and then remove those events from the container. The remaining events are then passed to the game world code.

8. The Rubine algorithm is a pen-based gesture-recognition algorithm designed to recognize user-drawn gestures. The algorithm has a two-step process. First, a library of known gestures is created that stores several mathematical features of the gesture. Then, as the user draws a gesture, the mathematical properties of that gesture are compared against the library to determine if there is a match.

9. An accelerometer measures acceleration along the primary coordinate axes, whereas a gyroscope measures rotation about said axes.

10. An augmented reality game might use the camera in order to “augment” the world, or alternatively use the GPS in order to detect nearby players or obstacles.

Chapter 6: Sound

1. Source sounds are the actual audio files created by a sound designer. These are usually WAV files for sound effects and some sort of compressed format for longer tracks. The metadata describes how and in what context the source sound files should be played.

2. Switchable sound cues allow you to have different sets of sounds that play in different circumstances. For example, footsteps sound differently depending on the surface the character is walking on. With a switch, it would be possible to switch between different footstep source sounds, depending on the surface.

3. The listener is a virtual microphone that picks up all the sound that plays in the world, whereas the emitter is the object that actually emits the sound. In a first-person shooter (FPS), the listener might be at the camera, while an emitter might be tied to every non-player character (NPC), as well as other objects that may give off sound (such as a fireplace).

4. For a third-person game, both the position and orientation of the listener are very important. The orientation should always be camera relative, not player relative. If it’s player relative, you could have a scenario where an explosion that occurs on the right part of the screen comes out of the left speaker, which is not desirable. The position of the listener is often placed somewhere between the camera and the player position.

5. The decibel scale is a logarithmic scale. Going from 0 to -3 dB cuts the volume of the sound roughly in half.

6. Digital signal processing involves taking a signal and transforming it in some way. For audio, this is often used to alter a sound on playback. One example is reverb, which creates an echo. Another example would be a pitch shift, which increases or decreases the pitch of the sound. Lastly, a compressor would normalize the volume levels by increasing the volume of quieter sounds and decreasing the volume of louder sounds.

7. Oftentimes, a DSP effect such as a reverb only needs to affect part of the level. Therefore, it is valuable to be able to mark which regions within the level are affected by DSP.

8. Using a convex polygon to mark DSP might not work if the level has some parts that are above or below other parts. For example, if a field has a tunnel running under it, a convex polygon surrounding the tunnel would incorrectly also affect when the player runs above the tunnel.

9. The Doppler effect occurs when a sound emitter is quickly moving toward or away from you. As it moves toward you, the pitch increases, and as it moves away, the pitch decreases. This is most commonly seen with sirens on emergency vehicles.

10. Sound occlusion is when the sound waves must go through a surface to get to the listener. This results in a low-pass filter effect, which means that high frequency sounds become harder to hear. In obstruction, although there is no a direct path to the listener, the sound can go around the surface.

Chapter 7: Physics

1. An axis-aligned bounding box has sides that must be parallel to the coordinate axis (or coordinate planes, in the case of 3D), which makes calculations with AABBs a bit easier to compute. An oriented bounding box does not have this restriction.

2. P·Image + d = 0

P is an arbitrary point on the plane, Image is the normal to the plane, and d is the minimum distance from the plane to the origin. In a game engine, we will typically store Image and d in our plane data structure.

3. A parametric equation is one where the function is defined in terms of an arbitrary parameter (most often t).

A ray can be represented by R(t) = R0 + Imaget.

4. Two spheres intersect if the distance between their centers is less than the sum of the radii. However, because the distance requires a square root, it is more efficient to check if the distance squared is less than the sum of the radii squared.

5. The best approach for 2D AABB-AABB intersection is to check for the four cases where the AABBs definitely cannot intersect.

6. In line segment vs. plane intersection, a negative t value means that the segment is facing away from the plane.

7. An instantaneous collision detection algorithm only checks if the two objects are actively colliding on the current frame, whereas a CCD algorithm will also find collisions between frames.

8. For swept sphere, if the discriminant is negative, it means the two spheres do not collide. If it’s equal to zero, it means there is a t such that the two spheres tangentially intersect. Finally, if it’s positive, it means the two spheres fully intersect, and the smaller solution to the quadratic is the first point of intersection.

9. The accuracy of numeric integration methods ties directly to the size of the time step. If the time step is variable, the numeric integration methods will be volatile.

10. Velocity Verlet integration first computes the velocity at the midpoint of the time step, using the acceleration from last frame. This velocity is then applied to the integration of the position over the full time step. Then the acceleration is updated for the current frame, and this acceleration is applied to the computation of the velocity from the midpoint to the end of the time step.

Chapter 8: Cameras

1. Field of view is the angle that represents what is visible in the game world. A narrow field of view can be a source of motion sickness, especially in PC games.

2. The position of the basic follow camera can be computed using the following equation:

eye = target.position - target.forward * hDist + target.up * vDist

Once the eye position is computed, the camera forward vector is simply the vector from the eye to the target, which is as follows:

forward = eye - target.position

Then the up vector of the camera can be computed by crossing the target up with the camera forward to get camera left, and then crossing camera forward with camera left to get the camera up.

3. A spring follow camera has both an ideal and actual camera position. The ideal position updates every frame based on the rigid basic follow camera, while the actual camera position lags behind a little bit. This creates a much more fluid camera experience than the rigidity of the basic follow camera.

4. The camera’s position in an orbit camera should be stored as an offset from the target object, because it allows the rotations to more easily be applied.

5. The target position is stored as an offset from the camera, and as the character rotates and looks around, the target position must be rotated as well.

6. A Catmull-Rom spline is a type of spline curve that can be defined by a minimum of four control points: one before and one after the two points that are being interpolated between.

7. A spline camera can be very useful for cutscenes where the camera needs to follow a set path.

8. If a follow camera is being blocked by objects, one solution is to perform a ray cast from the target to the camera and then place the camera position at the first point of collision. Another solution is to reduce the alpha of objects that are between the camera and the target.

9. In an unprojection, a z-component of 0 corresponds to the 2D point on the near plane, whereas a value of 1 corresponds to the same point on the far plane.

10. Picking can be implemented by unprojecting the 2D point onto both the near plane and the far plane. Then, a ray cast between the two points can be performed to determine which object is intersected with.

Chapter 9: Artificial Intelligence

1. Node 0: {1, 4}; Node 1: {0, 2, 4}; Node 2: {1, 3}; Node 3: {2, 4}; Node 4: {0, 1}.

2. A typical area requires fewer nodes when represented by a navigation mesh as opposed to path nodes. At the same time, you can get more coverage of the area with a navigation mesh.

3. A heuristic is considered admissible if its estimated cost is less than or equal to the actual cost.

4. Manhattan: 6. Euclidean: Image.

5. The selected node has a star:

Image

6. Dijkstra’s algorithm.

7. When evaluating the cost of visiting a node, the A* algorithm takes into account both the actual cost from the start node and the estimated cost to the end node. Dijkstra’s algorithm only looks at the actual cost from the start node. The A* algorithm is generally more efficient than Dijkstra’s because it usually does not visit nearly as many nodes.

8. There are many potential solutions to this question. Here is one:

Image

9. The state design pattern allows the behavior of one specific state to be encapsulated in its own separate class. This allows states to become reusable and/or interchangeable. It also means that the “controller” class can be far more clean than in the alternative enumeration implementation.

10. A strategy is the overall vision for an AI player. A common approach is to break down the strategy into several goals that, when prioritized, can then be achieved by the AI through discrete plans.

Chapter 10: User Interfaces

1. A menu stack provides a couple different benefits. First of all, it ensures that the user can easily return back to a previous menu by popping the current menu off of the stack. Furthermore, it allows multiple UI elements to be drawn on top of each other, such as when a dialog box pops up to confirm/reject a particular action.

2. Letter key codes are typically sequential, so the key code for C is one after the key code for B, which is one after the key code for A, and so on. This property is helpful because you can check whether a key code is within the range of A through Z and then simply subtract the key code for A from it. This gives the letter offset in the alphabet and can then be added to the character “A” to get the appropriate character.

3. Typically, there is a known 2D screen space position that a waypoint arrow should always be at. However, because the arrow is actually a 3D object, we need the correct world space position to draw it. The way to determine this is to take the 2D screen space position and unproject it, which will yield the appropriate 3D position.

4. In order to calculate the axis and angle of rotation for the waypoint arrow, we first must construct a vector from the 3D player to the waypoint. Once this vector is normalized, we can use the dot product between the new vector and the original facing vector to get the angle of rotation. Similarly, the cross product can be used to calculate the axis of rotation.

5. The aiming reticule is always at a 2D point on the screen. We take this 2D point and perform two unprojections: one at the near plane and one at the far plane. The code can then construct a ray cast between these two points, and check to see the first object this ray cast intersects with. If it’s an enemy, the reticule should be red, and if it’s a friendly, it should be green.

6. In order to take a 3D coordinate and convert it into a 2D one for the radar, we ignore the height component. In a y-up world, this means we create a 2D vector that uses the x-component and the z-component of the 3D vector.

7. Absolute coordinates are a specific pixel location on screen, whereas relative coordinates are relative either to a key point on the screen (such as the center or the corners) or to another UI element.

8. Hard-coding UI text makes it difficult for nonprogrammers to change the text in development. Furthermore, hard-coded text cannot be easily localized. This is why text should be stored in external text files.

9. The ASCII character set can only display English letters. The Unicode character set solves this problem in that it supports many different writing systems, including Arabic and Chinese. The most popular Unicode representation is UTF-8, which is a variable-width encoding for characters.

10. User experience refers to the reaction a user has as he or she is using an interface. A well-designed game UI should have a positive user experience.

Chapter 11: Scripting Languages and Data Formats

1. A scripting language might allow more rapid iteration of game elements, and also makes it more accessible for nonprogrammers to edit. But the disadvantage of a scripting language is that its performance is not going to be as good as a compiled language such as C++.

2. A scripting language might make sense for the camera, AI state machines, and the user interface.

3. UnrealScript is one custom scripting language that’s popular for games. The advantage of a custom scripting language is that it can be tailored specifically for game use. In the case of UnrealScript, this means it supports state functionality. It also can have its performance tailored for game applications. The disadvantage, however, is that a custom language may have a lot more errors in its implementation than a general language.

4. A visual scripting system allows the use of basic flow charts to drive logic. It might be used for the logic in a particular level, such as when and where enemies spawn.

5. The statement int xyz = myFunction(); would break down into the following tokens:

int
xyz
=
myFunction
(
)
;

6. The regular expression [a-z][a-zA-Z]* matches a single lowercase letter followed by zero or more lower- or uppercase letters.

7. An abstract syntax tree is a tree structure that stores the entire syntactical layout of a program. The AST is generated in syntax analysis and can then be traversed in a post-order manner either to generate or execute its code.

8. A binary file format is typically going to be both smaller and more efficient to load than a text-based one. But the problem is that it’s much more difficult to determine the difference between two versions of a binary data file in comparison to a text data file.

9. For basic configuration settings, an INI file makes the most sense because it’s very simple and would therefore be easy for the end user to edit.

10. The two main components of UI add-ons in World of Warcraft are the layout and the behavior. The layout is implemented in XML and the behavior is implemented in Lua.

Chapter 12: Networked Games

1. The Internet Protocol is the base protocol that needs to be followed in order to send any data over the Internet. The main difference between IPv4 and IPv6 is how the addresses are configured. In IPv4, addresses are a set of four bytes, for a total of about four billion possible combinations. This seemed like a lot when IPv4 was introduced, but in recent years addresses have been running out. IPv6 increases the number of address combinations to 2128, which means it will not run out of addresses any time soon.

2. ICMP, or Internet Control Messaging Protocol, is designed mostly for acquiring status information about a network and connections. One use for games, however, is the echo request/echo reply feature of ICMP. This allows games to “ping” and determine the amount of time it takes a packet to travel a round trip.

3. TCP is connection based, which means two computers have to perform a handshake before any data can be transmitted. Furthermore, TCP guarantees that all packets will be received—and received in the order they were sent.

4. A port can be thought of a virtual mailbox. Each computer has approximately 65,000 ports for TCP and 65,000 ports for UDP. Certain ports are traditionally used for certain types of data (such as 80 for HTTP servers), but there is not necessarily a strict use of a particular number. The importance of ports is that they allow multiple applications on the same computer to be using the network without interfering with each other.

5. UDP is generally more preferable for real-time games because the additional guarantees that TCP provides might cause unnecessary slowdown. A great deal of information is not mission critical and therefore should not be resent if it’s lost. But with TCP, everything will be resent until an acknowledgement is received, which could substantially delay the data being processed if it’s received out of order.

6. In a server/client model, all clients connect to one central server. This means that the server needs to be more powerful and have more bandwidth than the clients. In a peer-to-peer model, every computer is connected to every other computer, which means that the requirements are symmetrical.

7. The server will periodically send updates regarding what the other players in the game are doing. Client prediction tries to fill in the blocks of time in between so that the overall gameplay can be smooth. Without client prediction for movement, opposing players would teleport around the screen every time an update is received, which would not be very fun.

8. In a lockstep model, the game is broken down into turns that are 150ms–200ms in length. All input commands during that turn are queued up until the end of the turn. Each peer them simulates the result of all of said input commands. The end result is a tightly synchronized game where very little information needs to be sent.

9. One example of an information cheat would be adding a radar to a game where one does not exist. A radar could be created using the position information that’s transmitted to all the clients. One way to solve this would be to have the server only send information regarding players the particular client should reasonably be able to see.

10. A man-in-the-middle attack is where a computer is placed in between the two computers that are communicating. This can be a big issue because it allows a nefarious person to intercept and read packets. The primary solution to this problem is to use an encryption scheme where the server is the only party that can read the contents of the packet.