How Physics Engines Work - Build New Games

Introduction, Motivation, and Goals

My name is Burak Kanber. I started programming in 1999 and built my first MMORPG in Perl in 2001. Years later I attended the prestigious (and tuition-free) Cooper Union, where I studied automotive and sustainable engineering. I earned my master’s in engineering simulating physics in the powertrain of hybrid cars. I’m presently an entrepreneur and the co-founder/CTO of Tidal Labs, which is proving that you can do awesome things as a bootstrapped startup in NYC. I love teaching (which I do often), and I still love physics. I scratched both of those itches by starting a series on my blog called Physics in JavaScript.

Eventually, Build New Games asked me to write an article on physics for video games. Game development is only a hobby – not my profession – so I did the only sensible thing and started building a legitimate physics engine in native JavaScript as research.

This article will guide you through the essential physics of game engines. This is not an A-Z “how-to” guide. I’ve left out some major optimization and implementation details; a single article can only cover so much. My true goal is to point you in the right direction and teach the concepts that you’ll need to build upon later. The next step is up to you. I hope you’ll use this article as a springboard and go write your own game engine, researching advanced concepts in the process.

I’ve left you a very helpful to-do list at the end of this article, if you’re just here looking for a summary of what you need to learn.

A Vector State of Mind

The first thing I learned when building my physics engine is that you really should embrace vector mathematics (linear algebra). I can already hear you groaning. For some reason, linear algebra always elicits a visceral reaction from people, but please allow me the opportunity to sway you.

Newton’s equations of motion – which dictate basically everything we’re about to do – have a very helpful mathematical quality. You can separate the equations of motion into smaller equations: one each for the x, y, and z axes.

What this means is that when you throw a baseball (in 2D, for argument’s sake) you can solve two separate equations for how it moves vertically and how it moves horizontally. When you throw a baseball, it moves forward at a constant velocity. But it also moves vertically at the same time – and its vertical motion is anything but constant, it’s affected by gravity! Newton’s equations allow you to look at these two aspects separately: “the baseball is moving forward at 90 mph but it’s also moving up and down like so under the influence of gravity”.

For many people, that’s how physics goes. You can just solve each dimension (forward and vertical motion) separately. But I’d encourage you not to do that. Why have three equations when you can have one? Sure, getting comfortable with vector math (again, called linear algebra) has a little bit of a steeper learning curve, but I’ve found that its so much better to work with. It’s worth it.

Points as Vectors

I decided to shift all my thinking into vectors. Vectors are like arrows–they have a length and they have a direction. They do not have an absolute position in space. The vector [5,3] points north-east, but it doesn’t have a “starting point”. Because of that, it may be strange to you that I decided to model individual points as vectors too. Any time I create a rectangle with one corner at (5,3), I actually model that point as the vector [5,3]. When modeling a point as a vector, simply make the assumption that the tail of the vector is at the point (0,0).

Why is this useful? The answer is twofold. First of all, it’s nice to have fewer data types. I’d rather have just one Vector class than a Vector and a Point class. But more importantly, it’s because math is great at abstracting things.

Here’s what I mean: if you’ve ever done any geometry, you’ve probably talked about points A and B. You can connect points A and B together to get the line segment AB. And when you’re doing pen-and-paper math, it’s that easy! You just write “AB” and you magically have a line segment. But programmers know it’s not that easy. We don’t simply manifest line segments. Fortunately, vector math steps in here. If you model points A and B as vectors, then you can get the line segment AB simply by doing the vector subtraction B-A.

Vectors simplify many of the common tasks you’ll need to do while writing a physics engine. Not only does linear algebra eliminate having to solve separate dimensions separately, it also helps you abstract away from “2D vs 3D” (because you’re not using variables like “positionX” and “positionY” and “positionZ” separately). Using vectors also helps you do intuitive things like adding and subtracting vectors–and keep in mind that you can model points as vectors, so any math you can do to vectors you can do to points as well, including rotation around any other point.

In short: get comfortable with vectors. The first thing you should do is write a simple vector class that does vector addition, subtraction, multiplication, dot product, cross product, and rotation. That’s all you need. The rest of your physics engine will be built atop vectors.

Newton’s 2nd Law of Motion

Newton’s 2nd Law of motion, which you may remember from high school, relates an object’s acceleration to the amount of force applied to that object. The formula itself is:

F = m a

The “F” and “a” here represent vectors. This means that, if you wanted to, you could split this equation up into three parts and use each separately:

F_{x} = m a_{x}

F_{y} = m a_{y}

F_{z} = m a_{z}

Because, as I mentioned earlier, objects move independently in each of their degrees of freedom. A baseball flying through the air succumbs to gravity in the “y” direction (making it accelerate downward), and moves forward at a steady pace in the “x” direction. Those two effects can be thought of, and even calculated, completely separately. When you combine your results for the “x” axis motion with the “y” axis motion, you get the familiar curved path that you’d expect the baseball to take.

Newton’s 2nd law is invaluable to us, and will drive nearly all of our objects’ interactions (there are some exceptions when talking about collision response). The reason this formula is so powerful is twofold: first, forces are how objects in the real world interact with each other; second, if you know an object’s acceleration, you can calculate both its velocity and its position. If you can model the forces correctly, you can model nearly any physical scenario accurately.

The process for simulating an object’s motion goes something like this:

Figure out what the forces are on an object
Add those forces up to get a single “resultant” or “net” force
Use F = ma to calculate the object’s acceleration due to those forces
Use the object’s acceleration to calculate the object’s velocity
Use the object’s velocity to calculate the object’s position
Since the forces on the object may change from moment to moment, repeat this process from #1, forever.

That’s pretty easy! Six straightforward steps is all it takes to model a simple object translating in any number of dimensions.

Of course, there are some caveats here. The above works for translation (moving left/right, up/down, back/forth), but to figure out rotation we need to modify it a little bit. I’ll talk about that in a few.

The other caveat is that sometimes it’s hard to figure out what exactly the forces on the object are supposed to be. Fortunately, game physics isn’t generally too crazy, and we only really need to think about gravity and springs and rockets and relatively simple things like that. It’s doubtful you’ll have to model magnetic fields or anything very advanced… though MIT Game Lab did just release a game where they model the effects of special relativity as if the speed of light were close to walking speed.

Calculating Velocity and Position

Acceleration, velocity, and position are closely related through calculus. Concretely speaking, velocity is the “rate of change over time” of position – in calculus, this is called the “time derivative” or simply the “derivative” for short. Think of the way we measure velocity: in miles per hour, or meters per second. Velocity is just a measure of distance travelled in a measure of time. Similarly, acceleration is just a measure of velocity changed in a measure of time.

The relationship between position, velocity, and acceleration is very helpful to us, because if we know the history of an object’s position, we can figure out its velocity by asking “how much did this object move in the last second?” We can calculate acceleration similarly.

More importantly, we can work backwards. If we know an object’s acceleration, we can figure out how much the velocity is supposed to change every second. And if we know the velocity, we can figure out how much the position is supposed to change every second. This technique is called “integration”. What we’re about to do is called numerical integration–in contrast with analytical integration which requires a pencil, paper, and a calculus textbook.

Every method of numerical integration for Newton’s laws involves some manner of maintaining a clock and doing calculations for tiny little snapshots in time. Game engines sometimes have the graphics engine running at 30 fps and the physics engine at 60 fps–in that case, the numerical integration is using a timestep of 1/60 seconds (16 milliseconds).

The simplest way to perform numerical integration is called Euler’s method. Here’s some pseudocode (assume that acceleration, velocity, and position all start at 0). In this example, force, acceleration, velocity, and position are all vectors–so this pseudocode handles 2D and 3D alike.

acceleration = force / mass
velocity += acceleration * time_step
position += velocity * time_step

Euler’s method is great to build a conceptual understanding from, but it’s not terribly accurate. In many situations, this technique becomes unstable and can yield some unpleasant results. So we’ll go a step further and use what’s called Velocity Verlet integration. Instead of the above, we can do the following:

last_acceleration = acceleration
position += velocity * time_step + ( 0.5 * last_acceleration * time_step^2 )
new_acceleration = force / mass 
avg_acceleration = ( last_acceleration + new_acceleration ) / 2
velocity += avg_acceleration * time_step

The most significant difference between the two methods is the use of the average acceleration between the last frame and the current one. This change gives us a significant accuracy advantage. Additionally, the 0.5 * last_acceleration * time_step^2 bit helps a little bit, but the squared time-step figure makes that correction very small: (1/60)^2 is 1/3600, after all.

You should also note that the new acceleration is calculated after the position has been updated; you’re therefore using the current frame’s position to calculate the new forces, rather than using last frame’s position to calculate the new forces.

Checkpoint #1

We still have a lot of material to cover, so I’m going to give you a homework assignment: look up how to model a variety of physical forces. Specifically, look at the weight force, spring force, viscous damping force, and maybe air drag. You can use any and all of these forces in combination with one another to create very realistic scenarios.

The great thing about modeling physics is this: if you model the forces correctly, you’ll get objects that behave realistically. You don’t need to think about directly modeling complicated interactions. If you have a ball that you want to drop from the sky, but you also want it to reach terminal velocity, then just model both the weight and air drag forces. You don’t need to think about any crazy hand-written logic to make the ball reach terminal velocity; the physics just handles that for you. And if you decide to change the environment from air to water, then you won’t need to rejigger your tweens or animations at all; you just change the value of “density” in the air drag formula. Physics takes care of the rest.

This example – a ball falling from the sky under the influence of gravity and air drag – will be the only one I don’t use vectors for. I want to ease you into the process, one step at a time, so this example will only have vertical (1D) motion. You could make it 2D by simply repeating the calculations for the x-direction.

I recommend you fork the JSFiddle below and change the value of “m” to see the effect of air drag on objects of different masses. A mass of 1 drops like a stone, but if you change the mass to 0.01, the ball quickly reaches a low terminal velocity and floats down. You can also change the density variable to see how the ball would behave in other fluids, like water or helium.

Rigid Bodies

We’ve essentially just modeled a “particle”. Particles are point-masses that don’t have a shape or size, and they don’t have an orientation (meaning they don’t rotate). Particles are the simplest physical entity to model.

Modeling something with shape adds a whole degree of complexity to our work, because things with shape should be able to rotate. This puts us in the realm of “rigid body physics.” Rigid bodies have a shape and orientation. They can rotate, and they can collide with each other.

We’re constraining ourselves to 2D for now, so we’ve only had to deal with two degrees of freedom thus far (the x- and y-axes). I’ll now introduce a third degree of freedom: rotation about the z-axis. The z-axis sticks straight out of the screen and points at your face. When I say “rotation about the z-axis”, you can picture a shape being impaled by a spike that sticks straight into your screen, and the object is allowed to spin around that spike.

If we were working in full 3D, we’d have 6 degrees of freedom to deal with: translation in x, y, and z (called “sway”, “heave” and “surge”), and rotation about x, y, and z (called “pitch”, “yaw”, and “roll”).

Once you understand the 2D examples we’re doing here it’s not too much of a stretch to go full 6DoF (six degrees of freedom). One difference between our 2D and full 3D is the potential involvement of a transformation matrix for 6DoF motion; here we’re using vector functions to handle translation and rotation, but the full 6DoF motion can be handled nicely by a single transformation matrix operation that handles both translation and rotation in one (sort of) neat matrix operation. You can still pull off 6DoF motion using only vectors, but just in the same way we’ve come to appreciate the ease of using vectors for 2D, you’ll appreciate the ease of using the transformation matrix in 3D.

Keep your eye on my blog for some 3D physics with Three.js in the future!

Rotation

Newton’s 2nd Law of Motion can be extended to rotation in a relatively straightforward manner. We’ll only need to update the variables slightly to solve for rotation.

First of all, our state variables for position, velocity, and acceleration change. We now need to think about rotation angle, angular velocity, and angular acceleration. We’ll use the Greek symbols theta (θ), omega (ω), and alpha (α) for these respectively. Since our example only allows rotation around the z-axis, these values will be scalars and not vectors like their linear counterparts. The angle theta is dimensionless and typically measured in radians. Angular velocity is measured in radians per second, and acceleration is measured in radians per second-squared.

Next, we replace “mass” with “moment of inertia”. Mass–or inertia–is an intrinsic quality of an object that resists acceleration (the more mass an object has, the slower it is to accelerate under a fixed force). Therefore, moment of inertia is a property of an object that resists angular acceleration.

Different shapes have different moments of inertia, even if they’re the same mass. For example, spinning up a 3-pound disc is easier than spinning a 3-pound hoop of the same size, because the hoop has all its mass towards the outer edge. To make matters more complicated, the moment of can change depending on what point and axis of an object you’re trying to rotate around. For these reasons, we’re just going to use a simple rectangle so we don’t get caught up in the calculus of moments of inertia. You can look up the moments of inertia of common shapes; if you need to model a complicated shape, then you can do so with some heavy-handed calculus and the triple integral function.

Finally, we replace force with “torque”. Torque is a force that has a tendency to rotate an object. Torque can be calculated by figuring out how far a force is applied from the center of rotation of an object–but we also have to only look at the component of the force that pushes at a 90 degree angle to the center point of our object.

For example, if you’re using a wrench that’s one foot long and you push on it at an exact 90 degree angle with 5 pounds of force, the torque on the bolt is 5 foot-pounds. If you push at a different angle, the torque will be less. If you pushed directly in towards the bolt from the end of the wrench, the torque would be zero and the wrench won’t move!

So we can now rewrite Newton’s 2nd Law of Motion for rotation. I’ve written this with scalars because we’re only thinking about rotation about the z-axis today.

T = J a

Seems too easy! We can also integrate this using exactly the same techniques as we used for translation. The hardest part about this for beginners is typically figuring out how forces result in torques, but with some hard work, diagrams, and a linear algebra refresher, it quickly becomes a trivial task. Hint: review the vector cross product. The cross product of the force vector and the “displacement” vector (the distance from the object’s center to where the force is applied) that gives you the resulting torque on the object. You’ll also want to brush up on your trigonometry to make sure you’re very comfortable projecting vectors and determining vector components (remember “SOHCAHTOA”?).

Checkpoint #2

This example models a rigid body rectangle. The rectangle is attached to the ceiling at one of its corners by a spring, and is allowed to fall. The spring stretches and grabs the corner, making the rectangle bounce and rotate.

There are various forces modeled in this example. The weight of the rectangle is modeled as a force. The spring is modeled as a force, and since that force is not applied to the center of the rectangle we also need to calculate a torque that will cause the rectangle to rotate. If you were to connect the spring to the center of the box, it wouldn’t spin at all (in a perfect world). I’ve also modeled viscous damping, both for translation and rotation. Viscous damping is a good way to model the various aggregate frictional losses of different materials. Our spring isn’t perfect, and so it won’t bounce forever. Viscous damping eventually causes the box to stop bouncing and spinning.

You’ll also find a Vector class and a Rectangle class in this example. All the physics is done with vectors, and each time-step rotates and translates the box using vector math as well.

Collision Detection

The most complicated aspect of physics engines is collision detection. Collision detection is generally a very expensive operation, and since a game may have dozens or hundreds of objects, a lot of effort is put into optimizing collision detection algorithms.

Another layer of complexity arises when objects that can rotate collide. And yet another degree of complexity emerges when you try to collide objects with complex shapes. Checking if two circles collide is easy; you just check if the distance between their centers is less than the sum of their radii. Checking if two rectangles collide is a little trickier, but not too tough if they’re not allowed to rotate (these are sometimes called “AABB”, or “axis-aligned bounding boxes”). Once the rectangles are allowed to rotate the problem gets much trickier, and continues to be tricky for other shapes, like triangles and pentagons and arbitrary polygons.

Just so this discussion doesn’t get out of hand, we’re only going to consider convex polygons. If you can draw any line through a shape and it only cuts through that shape once, it’s a convex polygon. If you can draw a line that cuts through the shape two or more times, like a crescent moon or a star, then the polygon is concave and we’re not going to talk about those.

In fact, most engines will break a concave (difficult) shape into a bunch of convex (easier) shapes. So while we’re not discussing concave shapes here, you can extend these theories to fit nearly any application.

Because entire books have been written on collision detection (and still can’t do it justice) we’re going to have to hone in on only one aspect of collision detection. A collision detection routine may look something like this:

Split the world up into sections via a grid or a quadtree (or octree in 3D)
If two objects are in the same section of the grid, use a very simple collision routine to see if you should investigate further. Typically you’ll use AABB testing for this.
If you still think the objects are colliding, use a more thorough algorithm to get the final word on if they’re colliding.
If they are colliding, extract information about the collision and figure out how to make the objects react.

And, unfortunately, there’s yet another level of complexity hidden here: if you have very fast moving objects (like bullets), it’s possible that they’ll “hop” over their target from one frame to the next and avoid collision detection altogether, even though the bullet has passed through the target. The fix for this is called continuous collision detection, but for now we’ll only talk about discrete collision detection.

For now we’re only going to talk about step 3 above: how to tell if two objects are colliding. We’ll leave out optimizations (steps 1 and 2), but fortunately Build New Games already has an excellent article on broad-phase collision detection, which covers steps 1 and 2 above. We’ll only perform a very rudimentary collision response (step 4) and we’ll leave out continuous collision detection–so our solution may be buggy if you attempt to use small or fast objects.

Separating Axis Theorem

There are two predominant collision detection methods. One method, called GJK, is more commonly used for 3D collision detection because the algorithm we’re going to use is expensive when done in 3D. However, the separating axis theorem is widely used for 2D collision detection, and luckily for us it’s much easier to conceptualize and visualize.

Imagine two objects on the screen, and imagine we’re looking at them from the top-down perspective (for example, we’re in a helicopter looking at the roofs of buildings). One way to test if those two objects are colliding is to go down and stand with the objects on the screen (at street level, so to speak), and walk in a circle around them. If we ever see a gap between the two objects then we know for certain that they’re not colliding. If we finish our walkabout and didn’t ever see a gap between them then they must be colliding, because we’ve looked at them from every possible angle and never saw the two objects separated.

When we’re writing code to do this we can’t check the view from millions of angles around the objects, so we need a way to simplify the procedure. Luckily, we only have to check the view from a few key spots: one for each of the sides in our two shapes. If we’re checking to see if a triangle and a heptagon (which has 7 sides) intersect, then we have to check the view from ten different angles.

More concretely, what we’re trying to do is look at the objects’ shadows (projections) against a number of different axes, and testing to see if their shadows overlap. If we check all ten axes in our triangle/heptagon example, and the shadows overlap in all ten axes, then we have a collision. If there’s a gap on even one axis, then we can immediately stop looking because the two objects are definitely not colliding.

Moreover, if we’re using rectangles (which we will), we can simplify things further. Rectangles have two sets of parallel edges, and since we’re doing projections for each edge, we’ll have two sets of duplicate results for each rectangle. Therefore we can reduce the number of axes to check from eight to four when looking at rectangles. Just pick the left and top side of each rectangle to use as test axes (or any two perpendicular sides).

Vector math makes the separating axis theorem very easy. We can get our test axes simply by subtracting corresponding rectangle vertices, and projecting “shadows” onto those axes is just a matter of finding the dot product of each rectangle’s vertex and the test axis.

Collision Response

Determining that two objects are colliding is only half the battle. Once objects are seen to collide, they need to react to it in some way. Much of this will depend on your game mechanics. A bullet fired at a character should do damage. A missile should do damage and explode. But today we’ll look at a very physical response: two boxes knocking into each other and bouncing away.

Collision response for two simple shapes without rotation is pretty easy. Momentum, or mass times velocity, helps us out here. When two really hard objects collide, the total momentum of the two objects doesn’t change. Real objects aren’t perfectly hard, so we introduce a fudge factor called the coefficient of restitution, which basically models the elasticity of the collision.

Unfortunately, our rotating objects are a bit more complex. Not only will the two objects bounce off of each other, but their rotation will change based on which points on the objects are involved with the collision. Additionally, since objects in real life very rarely “overlap” the way they can in a computer simulation, we’ll typically backtrack and correct their positions so that they’re just touching and not overlapping at all.

This leaves us with the need to know more about the collision. We would like to know some characterization of the overlap between two shapes. If we do “rewind” the collision until the point where they just touch, we’d like to know the location of that point. Many times we’ll also want to know the “penetration depth” of one object into another.

Due to the overwhelming complexity of collision detection and collision response, we can’t get too in-depth here. As a simplification, let’s decide that we don’t need to rewind the collision and that as long as we can get one point that’s somehow involved in the collision we’ll make do. There are many more sophisticated methods available, and the best way to learn these is to study existing physics engines like Box2d or Bullet. Both Bullet and Box2d have been ported to JavaScript, but I would recommend checking out the original C++ source if you know how to read it; some ports are automatically performed and not optimized for readability.

Once we’ve found a point involved in a collision (see the next code example to see how I did that), there are two ways we can use that information in a collision response. We can either use information about the depth of the collision and generate a very large force acting on the two objects (like a very stiff spring due to the “compression” of the object’s material as one penetrates the other), or, more commonly, we can use an impulse response which means we simply modify the objects’ velocities (technically, momentum) without any regard for the forces involved.

Checkpoint #3

Our final checkpoint exhibits a rotating box falling and colliding with a fixed box. While the demonstration of the separating axis theorem is complete, the collision response leaves a bit to be desired from more sophisticated techniques.

This implementation of the separating axis theorem test is certainly not definitive, however, and you should craft your own to fit your use-case. This example does not quit early if there’s a test axis with no overlap; quitting early is a great performance optimization. Additionally, this example uses some heuristics to extract the rectangle vertex “most involved with” the collision.

More sophisticated implementations would include rewinding the collision to the moment of impact, and also returning the contact point and normal. Additionally, the collision response here is primitive and not physically accurate; it’s just there to demonstrate that the separating axis theorem works.

Conclusion

You should now have a better understanding of what’s required to build a physics engine (in any language). Your journey doesn’t end here, and for your convenience I’ve given you a to-do list below. If you enjoyed this article please be sure to follow @bkanber on Twitter, and be sure to check out my blog, where I talk about physics, machine learning, and other interesting things.

To-Do:

Learn the basic vector math functions and implement them: addition, multiplication, dot product, cross product, rotation, and length are all useful to calculate.
Learn about as many different forces as you can: weight (ie, gravity on Earth), spring force, viscous damping, air drag, static and dynamic friction, and gravitation (ie, gravity in space).
Learn numerical integration techniques: Euler’s method (for conceptual understanding), velocity verlet (for most simple games/kinematics), midpoint method (for most simple applications/general forces), Runge-Kutta methods (for advanced and accurate simulation).
Develop an understanding of how forces become torques when applied to a rigid body. This is shown with the spring force in checkpoint #2, but you should aim to develop an understanding of how forces become torques, and how objects behave when rotating.
Learn the separating axis theorem and develop a thorough intuitive understanding of it. This is important not just for collision detection but also to help reinforce your learning of vector math, since this technique uses several vector concepts in its implementation.
Learn how to simplify the separating axis theorem to be a basic axis-aligned bounding box test. You can use this as a cheap first-pass in your collision detection algorithm.
Look into quadtrees as an even more effective first-pass in your collision detection algorithms. Also think about the difference between using quadtrees and using a simple fixed grid. Choose whichever better fits your application; will your objects be evenly distributed throughout the game world, or will they be clumped together?
Learn continuous (as opposed to discrete) collision detection algorithms. You can still use the separating axis theorem or AABB tests, but the way you test objects is a little different.
Learn more sophisticated collision response techniques. There are force-based methods and impulse-based methods. The route you choose depends on how you model your objects and what information your collision detection algorithm returns. Learn about collision normals, angle of reflection and momentum.
Understand the GJK algorithm. If you’re working towards 3D this will be very helpful, but even if you’re only working in 2D it’s worth the effort. The GJK algorithm is very abstract and hard to develop intuition about, but it’s one of the most interesting algorithms I’ve seen and is based on equally interesting mathematical concepts.