Raytracing With Landscapes
Lately, I've been playing Genshin Impact and was blown away by the quality of the graphics in this game. Wow! Take a look at the terrain, the shadows cast by the abundant foliage, and the sunsets which change the usual greens and browns into a spectacular orange. All of this runs on my little iPhone SE, though you can get it on nearly any platform you prefer.
But Genshin is completely raster-shaded. It renders its graphics polygon by polygon, handling collision and lighting on a per-polygon basis. Meanwhile, a game like Cyberpunk is (in)famous for its raytraced graphics. Night City has so many levels and lights that it'd be unreasonable to process all that geometry polygon by polygon. Instead, the game simulates "photons" of light, where the path of each photon determines the color of a pixel. With this method, you can get more realistic effects and more complicated light behavior than with raster.
Can we combine them? Genshin's landscapes with raytracing? Let's see.
This instructable is meant to be a roadmap for someone who is familiar with C++ or a similar language like Java and is interested in building a neat little engine from scratch. I provide room for you to make your own customizations and extensions. If you are looking for specific code, Ray Tracing in One Weekend is a great classic tutorial (and it has heavily inspired the code snippets below as well).
The Instructable below is a S.I.D.E. project I worked on as part of Ms. Berbawy's POE class. To see more projects from our class, check out her website.
Raster-shaded games have been around forever. There are tons of optimizations and dirty tricks needed to make them look good. But though raytracing takes more graphics power, the concept and code are simpler. We don't need many tools to get this running. I'll try to keep external libraries to a minimum. Of course, if there are alternatives you prefer (especially if you are not using C++), feel free to mix and match.
C++
It's fast. It's popular, which means it comes with good support. It supports object-oriented programming, which is very useful for this project (be ready for a good deal of programming).
Here I use the MSVC compiler with C++17, though any other modern C++ compiler and operating system should work - what I do in C++ should have equivalents in most other languages.
I'll leave it to you to set up your dev environment of choice, though if you're looking to get started fast, read on.
CLion
When in doubt about how/what to set up, just get an IDE. I use CLion over alternatives like Dev-C++ or Eclipse because it does such a good job of solving your problems for you.
GitHub
Big projects need version control. You'll thank yourself later. If you're using CLion you can set this up with a few clicks.
Window Drawing Libraries
In CLion, you can set your C++ toolchain in Settings | Build, Execution, Deployment | Toolchains. It does a good job of auto-detecting if you already have a C++ compiler on your system.
Likewise, for GitHub (or the version control of your choice), go to VCS | Enable VCS, and follow the GUI.
There are no graphics headers in the C++ standard library. Possibly, this is because C++ is used in so many contexts that it's impossible to have a standard graphics lib that would make everyone happy. Go get yourself the OLC Pixel Game Engine header (thanks, OneLoneCoder) and add it to your project directory. PGE is simple and works out of the box. You're free to use something else, like SFML or Cinder or Swing, or write directly to the screen buffer if you are that kind of maniac.
Drawing to Window
The OnUserUpdate function in the sample OLC program (which you can find on its GitHub) is called once per frame, when OLC is ready for us to draw the next frame. This is when we can call Draw(int32_t x, int32_t y, olc::Pixel p) to set individual pixels. This is all that our barebones raytracer needs in terms of drawing operations; the rest will be in nearly pure C++.
As an example, here I loop across every row and column and draw a gradient that depends on the pixel coordinates.
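Here is a rough sketch of that gradient, not the exact code from my project. To keep it compilable without the PGE header, the color math lives in a hypothetical helper, gradient_at(), and the Rgb struct is only a stand-in for olc::Pixel. The loop that would sit inside OnUserUpdate is shown as a comment.

```cpp
#include <cstdint>

// Plain 8-bit RGB triple; a stand-in for olc::Pixel so this compiles without PGE.
struct Rgb {
    std::uint8_t r, g, b;
};

// Hypothetical helper: red grows from left to right, green grows from top to bottom.
Rgb gradient_at(int x, int y, int width, int height) {
    const double u = (width > 1) ? static_cast<double>(x) / (width - 1) : 0.0;
    const double v = (height > 1) ? static_cast<double>(y) / (height - 1) : 0.0;
    return Rgb{
        static_cast<std::uint8_t>(255.0 * u),
        static_cast<std::uint8_t>(255.0 * v),
        64  // constant blue so the corner at (0, 0) is not pure black
    };
}

// Inside OnUserUpdate of a PGE application, the loop would look roughly like:
//   for (int y = 0; y < ScreenHeight(); ++y)
//       for (int x = 0; x < ScreenWidth(); ++x) {
//           const Rgb c = gradient_at(x, y, ScreenWidth(), ScreenHeight());
//           Draw(x, y, olc::Pixel(c.r, c.g, c.b));
//       }
```

Once this shows up on screen, the raytracer only has to replace gradient_at() with a function that casts a ray for the pixel.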
Structure
For every frame of our raytracer, we
1. Iterate over every pixel and send out a ray in the direction of that pixel. Each ray is a completely separate calculation; individual casts only share data about the objects in the scene and do not interfere with each other. (It is for this reason that raytracing is easy to parallelize. More on threading later.)
2. Compute the ray's intersections with each object. If multiple intersections are found, pick the closest. If the ray intersects with nothing (goes off into space), it defaults to the background color and the bouncing ends for this pixel.
3. Calculate a bounce direction based on the intersected object.
4. Record the color of the object.
5. Cast a new child ray in the bounce direction, get the child's color, and combine it with that of the original ray.
Casting a ray can be nicely wrapped into a function (ray_color()) that takes the direction and origin of a ray and returns its color. Some readers with a critical eye may notice that this cast function is recursive! That is, if the original ray hits an object, it spawns a new ray in the bounced direction and calls ray_color(). ray_color() may itself bounce and spawn yet another ray, calling ray_color() again. This recursion finally ends when a ray finds nothing to intersect with or reaches a user-defined limit of bounces.
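To make the recursion concrete, here is a stripped-down sketch of ray_color(). It is an illustration, not my actual implementation: the Bounce struct and the Scene callback are hypothetical simplifications that hide all the geometry behind "give me the closest hit for this ray, or nothing."

```cpp
#include <functional>
#include <optional>

struct Vec3 { double x, y, z; };
struct Ray { Vec3 origin; Vec3 direction; };

// What a surface hands back when a ray hits it (hypothetical struct for this sketch).
struct Bounce {
    Vec3 attenuation;  // surface color, multiplied into the child ray's color
    Ray child;         // ray leaving the surface in the bounce direction
};

// The whole scene is abstracted as a callback returning the closest bounce, if any.
using Scene = std::function<std::optional<Bounce>(const Ray&)>;

Vec3 ray_color(const Ray& ray, const Scene& scene, const Vec3& background, int depth) {
    if (depth <= 0) {
        return Vec3{0.0, 0.0, 0.0};  // bounce limit reached: this path gathers no light
    }
    const std::optional<Bounce> hit = scene(ray);
    if (!hit) {
        return background;  // the ray flew off into space
    }
    // Recurse: the child's color is tinted by the color of the surface we hit.
    const Vec3 child = ray_color(hit->child, scene, background, depth - 1);
    return Vec3{hit->attenuation.x * child.x,
                hit->attenuation.y * child.y,
                hit->attenuation.z * child.z};
}
```

The depth parameter is the user-defined bounce limit. Without it, two facing mirrors could keep the recursion going until the stack overflows.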
Main Loop
I find it easiest to create a class of ray which holds a 3D point for its origin and a vector of x, y, z components for its direction. In a practical sense, both the origin and direction are just structures of three floating-point variables.
In my raytracer, the x and z axes lie flat on the ground plane. The z-axis points towards the viewer and the x-axis points sideways right, perpendicular to z. Y points up.
In the initial ray casting, the z component of the ray direction is constant; all rays point away from the viewer, along the negative z-axis. The x and y components are proportional to the screen coordinates of the pixel (see the first image).
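As a sketch of the ray class and the initial cast, something like the following would do. The helper primary_ray() and its viewport numbers (2 units tall, 1 unit in front of a camera at the origin) are assumptions for illustration; pick whatever field of view suits your scene.

```cpp
struct Vec3 { double x, y, z; };

struct Ray {
    Vec3 origin;
    Vec3 direction;
    // Point reached after travelling t lengths of direction from the origin.
    Vec3 at(double t) const {
        return Vec3{origin.x + t * direction.x,
                    origin.y + t * direction.y,
                    origin.z + t * direction.z};
    }
};

// Hypothetical helper: maps pixel (px, py) to a ray leaving a camera at the origin.
// Assumes pixel row 0 is the top of the screen.
Ray primary_ray(int px, int py, int width, int height) {
    const double aspect = static_cast<double>(width) / height;
    const double u = (px + 0.5) / width;   // 0..1 across the screen
    const double v = (py + 0.5) / height;  // 0..1 down the screen
    const Vec3 direction{
        (2.0 * u - 1.0) * aspect,  // x: negative on the left, positive on the right
        1.0 - 2.0 * v,             // y: screen rows grow downward, world y grows upward
        -1.0                       // z: constant, pointing away from the viewer
    };
    return Ray{Vec3{0.0, 0.0, 0.0}, direction};
}
```

The direction is left unnormalized here. That is fine for intersection tests as long as the rest of the code does not assume unit-length directions.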
Recursing for the Ray Color
I'll say it again; Ray Tracing in One Weekend is an amazing guide for writing a raytracer if you are looking for specific code. The second image uses many classes and techniques from RTOW. We represent the geometry as a hittable_list object, which contains a list of hittable objects. hittable is a superclass of whatever geometry is required, as long as it can provide a hit() function that computes a ray intersection with its own geometry.
In my code, I call hit() on my main hittable_list object, which scans through all the geometry looking for the one with the closest intersection. Then it puts the object's material and coordinate intersection within the hit_record struct and returns a boolean for whether an intersection was found (to get around the limitation of one return).
If an intersection is found, then scatter() is called on the object's material. This scatter() function primarily takes in a ray and computes its bounced result. The function is what defines a material: metal surfaces scatter() by reflecting, matte surfaces scatter() by returning a random direction, and glass surfaces scatter() by either reflecting or refracting.
scatter() also provides the attenuation, or surface color. The bounced ray is cast, and the final color is a combination of the attenuation and the child ray's color.
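A minimal sketch of the hittable / hittable_list / hit_record arrangement might look like the code below. It follows the RTOW naming, but the details are my simplification. In particular, ground_plane is a hypothetical example object added only so the list has something to hold, and material is just forward-declared.

```cpp
#include <memory>
#include <utility>
#include <vector>

struct Vec3 { double x, y, z; };
struct Ray { Vec3 origin; Vec3 direction; };

class material;  // defined elsewhere; the record only stores a pointer to it

struct hit_record {
    Vec3 point{};
    Vec3 normal{};
    double t = 0.0;                 // distance along the ray
    const material* mat = nullptr;  // material of the object that was hit
};

class hittable {
public:
    virtual ~hittable() = default;
    // Returns true and fills rec when the ray hits within (t_min, t_max).
    virtual bool hit(const Ray& r, double t_min, double t_max, hit_record& rec) const = 0;
};

// Hypothetical example object: an infinite horizontal plane at a given height.
class ground_plane : public hittable {
public:
    explicit ground_plane(double height) : height_(height) {}
    bool hit(const Ray& r, double t_min, double t_max, hit_record& rec) const override {
        if (r.direction.y == 0.0) return false;  // ray runs parallel to the plane
        const double t = (height_ - r.origin.y) / r.direction.y;
        if (t <= t_min || t >= t_max) return false;
        rec.t = t;
        rec.point = Vec3{r.origin.x + t * r.direction.x, height_, r.origin.z + t * r.direction.z};
        rec.normal = Vec3{0.0, 1.0, 0.0};
        return true;
    }
private:
    double height_;
};

class hittable_list : public hittable {
public:
    void add(std::shared_ptr<hittable> object) { objects_.push_back(std::move(object)); }
    bool hit(const Ray& r, double t_min, double t_max, hit_record& rec) const override {
        bool found = false;
        double closest = t_max;
        for (const auto& object : objects_) {
            hit_record candidate;
            // Shrinking the upper bound means later objects only win if they are closer.
            if (object->hit(r, t_min, closest, candidate)) {
                found = true;
                closest = candidate.t;
                rec = candidate;
            }
        }
        return found;
    }
private:
    std::vector<std::shared_ptr<hittable>> objects_;
};
```

Because hittable_list is itself a hittable, ray_color() only ever needs to call hit() on one object, no matter how much geometry is in the scene.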
Sphere
Sphere is a subclass of hittable. As long as it
1. takes a ray,
2. computes the ray's intersection point with itself (or returns false for no intersection), and
3. computes the surface normal at the intersection (this is for the scatter() function in material to do its job),
it fulfills the role of being a hittable.
But you might say, Shawn, there seems to be an awful lot going on in your implementation (or RTOW's implementation?). Well, there are some practical considerations here, which you may or may not have in your code.
The first chunk works through the quadratic formula for the ray-sphere intersection, up to computing the discriminant (the part under the square root). If the discriminant is negative at this point, the ray misses the sphere and the function can return false straight away. Otherwise, finish solving the formula.
t_min and t_max are simple bounds that the solution must fall in, otherwise false can be returned. This helps with specifying that intersections must be in front of the camera, though you may choose to do this in another step of your program.
Finally, we load the intersection point and material onto hit_record to return. set_outward_normal is a trivial method that makes sure the normal faces towards the viewer.
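The three chunks described above can be sketched as follows. This is written in the style of RTOW rather than copied from my project: it uses the "half b" form of the quadratic, leaves out the material pointer, and flips the normal inline instead of calling a helper, so treat the exact names and layout as assumptions.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };
Vec3 operator-(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Ray { Vec3 origin; Vec3 direction; };

struct hit_record {
    Vec3 point;
    Vec3 normal;
    double t;
};

struct sphere {
    Vec3 center;
    double radius;

    bool hit(const Ray& r, double t_min, double t_max, hit_record& rec) const {
        // Chunk 1: quadratic a*t^2 + 2*half_b*t + c = 0, up to the discriminant.
        const Vec3 oc = r.origin - center;
        const double a = dot(r.direction, r.direction);
        const double half_b = dot(oc, r.direction);
        const double c = dot(oc, oc) - radius * radius;
        const double discriminant = half_b * half_b - a * c;
        if (discriminant < 0.0) return false;  // the ray misses the sphere

        // Chunk 2: take the nearer root that lies inside (t_min, t_max).
        const double root_term = std::sqrt(discriminant);
        double t = (-half_b - root_term) / a;
        if (t <= t_min || t >= t_max) {
            t = (-half_b + root_term) / a;
            if (t <= t_min || t >= t_max) return false;
        }

        // Chunk 3: fill in the record, with the normal facing against the ray.
        rec.t = t;
        rec.point = Vec3{r.origin.x + t * r.direction.x,
                         r.origin.y + t * r.direction.y,
                         r.origin.z + t * r.direction.z};
        const Vec3 outward = Vec3{(rec.point.x - center.x) / radius,
                                  (rec.point.y - center.y) / radius,
                                  (rec.point.z - center.z) / radius};
        const bool front_face = dot(r.direction, outward) < 0.0;
        rec.normal = front_face ? outward : Vec3{-outward.x, -outward.y, -outward.z};
        return true;
    }
};
```

A small positive t_min such as 0.001 also keeps a bounced ray from immediately re-hitting the surface it just left because of rounding error.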
Diffuse Material
Remember that ray_color calls scatter() on the intersected object's material to get the direction of the bounced ray. Now we will define the scatter() function for a matte material. Matte, or Lambertian, is a simplified representation of a "boring" surface; no reflections, no transparency, no refractions.
In scatter() we receive the normal vector as a parameter; it was computed earlier in the hit() code. scatter() should then return a new ray pointing in the bounced direction, along with the material's color (which can be a field in the material class).
Matte surfaces return a random ray in the 180deg hemisphere around the normal.
1. Compute a random vector with each component randomly selected from [-1, 1]. (You may want to go the extra mile and normalize the vector to length 1 for physical accuracy.)
2. See if the vector is in the hemisphere defined by the normal. This is true if the dot product of the random vector and the normal is > 0. If not, multiply all components by -1.
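The two steps above can be sketched as below. One thing goes beyond the steps as written: I reject random points that fall outside the unit ball before normalizing, so that the corners of the cube do not get picked more often than other directions. The function names and the choice of std::mt19937 are assumptions for this sketch.

```cpp
#include <cmath>
#include <random>

struct Vec3 { double x, y, z; };
double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Step 1: random direction of length 1. Components come from [-1, 1]; points outside
// the unit ball are rejected (an addition to the steps above) before normalizing.
Vec3 random_unit_vector(std::mt19937& rng) {
    std::uniform_real_distribution<double> dist(-1.0, 1.0);
    while (true) {
        const Vec3 v{dist(rng), dist(rng), dist(rng)};
        const double len2 = dot(v, v);
        if (len2 > 1e-12 && len2 <= 1.0) {  // also skips the degenerate near-zero vector
            const double len = std::sqrt(len2);
            return Vec3{v.x / len, v.y / len, v.z / len};
        }
    }
}

// Step 2: flip the vector if it points into the surface instead of away from it.
Vec3 random_in_hemisphere(const Vec3& normal, std::mt19937& rng) {
    const Vec3 v = random_unit_vector(rng);
    return dot(v, normal) > 0.0 ? v : Vec3{-v.x, -v.y, -v.z};
}
```

Passing the generator in by reference keeps the function free of global state, which matters if you later render on several threads (give each thread its own generator).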
Metal Material
Metal, like Matte and all material objects, is also defined by its scatter() function. However, instead of picking a random vector in the hemisphere, we reflect the incoming ray across the normal vector (see the image).
There is a very bad and naive, but intuitive, way to do this by converting vectors into angles. The better way is to compute the reflected direction purely through vector math.
I - 2 * dot( I, N ) * N
Here I and N are the incident and normal vectors in the diagram, and N must have length 1. It is a little mind-bending, but you can plug and play this formula regardless of your understanding.
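In code, the formula is close to a one-liner. This sketch assumes the normal passed in has length 1; the incident vector can have any length, and the reflection keeps that length.

```cpp
struct Vec3 { double x, y, z; };
double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Mirror the incident direction about a unit-length surface normal: I - 2 * dot(I, N) * N.
Vec3 reflect(const Vec3& incident, const Vec3& normal) {
    const double k = 2.0 * dot(incident, normal);
    return Vec3{incident.x - k * normal.x,
                incident.y - k * normal.y,
                incident.z - k * normal.z};
}
```

For example, a ray heading down and to the right, (1, -1, 0), that hits a floor with normal (0, 1, 0) leaves heading up and to the right, (1, 1, 0).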
Glass
Glass refracts, and if refraction isn't possible, it reflects. Consider the refraction formula:
θ′ = arcsin( (η / η′) sin θ )
θ is the angle between the incoming ray and the normal vector (upon entry) and θ′ is the angle between the refracted ray and the normal vector (upon exit). η and η′ are the refractive indices of the medium the ray is leaving and the medium it is entering; in our case, air and glass. 1.5 is a safe value for glass and 1.0 for air, so η / η′ is 1/1.5 when the ray enters the glass and 1.5 when it leaves. You can look up values for other transparent materials on Google.
Since the domain of arcsin is [-1, 1], any value of (η / η′) sin θ that falls beyond this range forces the glass to reflect. This can only happen when the ray travels from the denser material into the lighter one, such as from inside the glass back out into air. It is why, looking up from underwater at a shallow angle, you see the lake bed mirrored in the surface instead of the sky.
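Here is a sketch of refraction done with vectors rather than angles, in the style RTOW uses. It assumes both inputs have length 1 and that the normal faces against the incoming ray; eta_ratio is η divided by η′. Returning an empty optional to signal "reflect instead" is my own choice for this sketch.

```cpp
#include <cmath>
#include <optional>

struct Vec3 { double x, y, z; };
double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Returns the refracted direction, or nothing when Snell's law has no solution
// (the caller should then reflect the ray instead).
std::optional<Vec3> refract(const Vec3& unit_incident, const Vec3& unit_normal, double eta_ratio) {
    const double cos_theta = std::fmin(-dot(unit_incident, unit_normal), 1.0);
    const double sin_theta = std::sqrt(1.0 - cos_theta * cos_theta);
    if (eta_ratio * sin_theta > 1.0) {
        return std::nullopt;  // outside the domain of arcsin: must reflect
    }
    // Component of the refracted ray perpendicular to the normal.
    const Vec3 perp{eta_ratio * (unit_incident.x + cos_theta * unit_normal.x),
                    eta_ratio * (unit_incident.y + cos_theta * unit_normal.y),
                    eta_ratio * (unit_incident.z + cos_theta * unit_normal.z)};
    // Component parallel to the normal, chosen so the result has length 1.
    const double parallel = -std::sqrt(std::fabs(1.0 - dot(perp, perp)));
    return Vec3{perp.x + parallel * unit_normal.x,
                perp.y + parallel * unit_normal.y,
                perp.z + parallel * unit_normal.z};
}
```

In a glass material's scatter(), you would pass 1.0 / 1.5 when the ray hits the outside of the surface and 1.5 when it hits from the inside, and fall back to the reflection formula whenever the function returns nothing.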
Lights
This must be the easiest feature of all. In raytracing, many complicated-seeming features just "work out" because we are simulating light rays rather than making assumptions about 3D, like in raster graphics.
There is already a light source in the engine -- the background, or the "sky." The last bounce of each pixel will always return the background color, and thus a part of this will be multiplied into the pixel color.
Go ahead, try to make the background black, or red, or green. The geometry will tint towards the background color. (For black, nothing will show up because there is no light source left!)
Add an emission color to the material superclass. This color will be 0, 0, 0 for non-emissive materials. Then, in the ray_color() function, get the emission color of the intersected object's material and add it to the final returned color.
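As a sketch of how the emission color enters the final color, consider the snippet below. The material_sketch struct and the shade() function are hypothetical names for illustration; in the real engine this arithmetic sits at the end of ray_color().

```cpp
struct Color { double r, g, b; };

// Hypothetical minimal material: a surface color plus the light it emits.
struct material_sketch {
    Color attenuation;
    Color emitted{0.0, 0.0, 0.0};  // stays black for non-emissive materials
};

// Final color of a hit: what the surface emits, plus the tinted color of the child ray.
Color shade(const material_sketch& m, const Color& child) {
    return Color{m.emitted.r + m.attenuation.r * child.r,
                 m.emitted.g + m.attenuation.g * child.g,
                 m.emitted.b + m.attenuation.b * child.b};
}
```

Emission values above 1 are fine and even useful: a lamp with emission (4, 4, 4) still lights its surroundings after a couple of bounces have attenuated it. Just clamp the final pixel color before drawing.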
Perlin
Finally, let's get some terrain going here. I use a simple approximation of terrain here; the ground is a 2D grid of squares, where each vertex of these squares has a height value. Each square has four vertices, and thus each corner of the square is at a different height.
OK, this doesn't make sense because a flat square can't meet 4 arbitrary heights. Instead, each square is split into two triangles, and each triangle takes one height at each of its 3 corners.
Conceptually, Perlin noise is easy. It's a single stateless function that takes a coordinate (in this case, x and y) and returns a height value. I found a nice header-only C++ library, Simplex Noise, that does this.
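Here is a sketch of turning a height function into the two-triangles-per-square grid. To keep it self-contained I do not call the noise library; placeholder_height() is a made-up stand-in (a product of sine waves), and you would pass the library's noise function in its place. The grid lies on the x/z ground plane with height along y.

```cpp
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

struct Vec3 { double x, y, z; };
struct Triangle { Vec3 a, b, c; };

// Stand-in for a real noise function: stateless, takes a ground coordinate, returns a height.
double placeholder_height(double x, double z) {
    return 0.5 * std::sin(x) * std::cos(z);
}

// Builds cells_x * cells_z squares, each split into two triangles.
std::vector<Triangle> build_terrain(int cells_x, int cells_z, double cell_size,
                                    const std::function<double(double, double)>& height) {
    std::vector<Triangle> triangles;
    triangles.reserve(static_cast<std::size_t>(cells_x) * cells_z * 2);
    // Vertex (i, k) of the grid, with its height sampled from the height function.
    const auto vertex = [&](int i, int k) {
        const double x = i * cell_size;
        const double z = k * cell_size;
        return Vec3{x, height(x, z), z};
    };
    for (int k = 0; k < cells_z; ++k) {
        for (int i = 0; i < cells_x; ++i) {
            const Vec3 v00 = vertex(i, k);
            const Vec3 v10 = vertex(i + 1, k);
            const Vec3 v01 = vertex(i, k + 1);
            const Vec3 v11 = vertex(i + 1, k + 1);
            // Split the square along the v00-v11 diagonal.
            triangles.push_back(Triangle{v00, v10, v11});
            triangles.push_back(Triangle{v00, v11, v01});
        }
    }
    return triangles;
}
```

Each Triangle would then become a hittable with its own ray-triangle intersection in hit(). Because neighboring squares sample the same height at shared vertices, the surface has no cracks.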
For the scene above, I added a metal ground plane for the "water" and changed the background to a yellowish gradient (the gradient color can be computed based on the ray's direction). I could have used a glass material for the "water," but metal looked better aesthetically.
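A direction-based background gradient can be sketched like this. The function name and the two-color blend are assumptions; the colors themselves are up to you (a pale yellow at the bottom fading to a deeper orange at the top, for instance).

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };
struct Color { double r, g, b; };

// Blends between two colors depending on how steeply the ray points up or down.
Color sky_color(const Vec3& direction, const Color& bottom, const Color& top) {
    const double len = std::sqrt(direction.x * direction.x +
                                 direction.y * direction.y +
                                 direction.z * direction.z);
    const double unit_y = (len > 0.0) ? direction.y / len : 0.0;
    const double t = 0.5 * (unit_y + 1.0);  // 0 when looking straight down, 1 straight up
    return Color{(1.0 - t) * bottom.r + t * top.r,
                 (1.0 - t) * bottom.g + t * top.g,
                 (1.0 - t) * bottom.b + t * top.b};
}
```

ray_color() would call this whenever a ray hits nothing, in place of returning a single fixed background color.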
Render Gallery
In one render I extend the width of the terrain by simply changing the bounds of the Perlin generator. Perlin noise is useful in that it doesn't need to generate values sequentially -- you just specify any point in the world and it returns the height there. I could render any arbitrary portion of this mountain world without having to generate the rest of it.
The night scene demonstrates how the color of the sky (or any light source) can change the rendered color of the surfaces it illuminates. The background environment lighting is perfectly gray — no color tinting. So the colors of the water and hills are as I set them to be, standard green and blue.
Scenes with many reflections take tons of time to render. It's an ongoing problem in ray tracing because the extra bounces needed to render lights make the rendering equation take a lot longer to converge. Photon mapping, Metropolis light transport (MLT), and other methods exist to solve this, but they aren't something I can do in the scope of this project!