Wednesday, May 1, 2019

Rendering Lots of Blocks Efficiently in Unity

I've spent the last few weeks trying to see how many blocks I can render while keeping everything at a nice frame rate (30-60fps) with no frame drops. A Minecraft type engine has a few rendering steps, as far as I know:


1. Culling: Only process the blocks that aren't completely surrounded by other blocks with cube models. The other blocks, which we can't see, are "culled" away.

By "cube models" I mean that they have the default block model that takes up the whole space. For example, if a torch is on a wall the block it is attached to should still display, but if I replace the torch with a dirt block I no longer need to render the block the torch was attached to (assuming it isn't touching any other air blocks)

By the way, for all non-culled blocks, only rendering the block faces that are visible (and hiding faces that are against a wall) can end up almost halving your triangle count sometimes, which really helps! I highly recommend trying it (there's a small sketch of this after the list).

2. For all the non-culled blocks, convert their raw data into triangles (or some other intermediate data of some kind)

3. Move that data to the GPU

4. Tell the GPU to render it
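
To make steps 1 and 2 concrete, here's the rough shape of the loop. IsSolid, AddFace, and the Direction values are illustrative stand-ins, not my actual code:

// Steps 1 and 2: skip fully buried blocks, and only emit faces that touch a non-solid block.
void MeshChunk(Chunk chunk, List<Vector3> vertices)
{
    for (int x = 0; x < chunkSize; x++)
    for (int y = 0; y < chunkSize; y++)
    for (int z = 0; z < chunkSize; z++)
    {
        if (!IsSolid(chunk, x, y, z)) continue;
        // Step 1: if all six neighbors are solid cube models, no face below
        // gets emitted, so the block is culled away for free.
        // Step 2: each face is only emitted if its neighbor is non-solid.
        if (!IsSolid(chunk, x - 1, y, z)) AddFace(vertices, x, y, z, Direction.Left);
        if (!IsSolid(chunk, x + 1, y, z)) AddFace(vertices, x, y, z, Direction.Right);
        if (!IsSolid(chunk, x, y - 1, z)) AddFace(vertices, x, y, z, Direction.Down);
        if (!IsSolid(chunk, x, y + 1, z)) AddFace(vertices, x, y, z, Direction.Up);
        if (!IsSolid(chunk, x, y, z - 1)) AddFace(vertices, x, y, z, Direction.Back);
        if (!IsSolid(chunk, x, y, z + 1)) AddFace(vertices, x, y, z, Direction.Forward);
    }
}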


I had initially been doing step 1 in a compute shader (running one GPU thread per block, then appending the coordinates of all blocks that aren't culled to an AppendBuffer and reading this out later). If this seems like overkill, you're right. The reason I was doing it that way was because I borrowed that code from my VR Sand project. In VR Sand, blocks are changing very frequently, so I needed this whole process to be very fast. And it was, but when I tried to scale up my world sizes, suddenly my ComputeBuffers needed to be very large and the cost of managing them all became slow. I also couldn't easily increase the chunk sizes to something like 64*64*64 before my compute buffers started complaining that that was too many threads.
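
For reference, the C# side of that GPU approach looked roughly like this (the shader, kernel index, and buffer names here are made up for illustration; only the Unity API calls are real):

// One GPU thread per block; the compute shader appends the coordinates
// of every non-culled block to an append buffer.
ComputeBuffer visibleBlocks = new ComputeBuffer(maxBlocks, sizeof(int) * 3, ComputeBufferType.Append);
ComputeBuffer countBuffer = new ComputeBuffer(1, sizeof(int), ComputeBufferType.Raw);

visibleBlocks.SetCounterValue(0);
cullShader.SetBuffer(cullKernel, "VisibleBlocks", visibleBlocks);
cullShader.Dispatch(cullKernel, chunkSize / 8, chunkSize / 8, chunkSize / 8);

// Read back how many block coordinates were appended, then fetch them.
ComputeBuffer.CopyCount(visibleBlocks, countBuffer, 0);
int[] count = new int[1];
countBuffer.GetData(count);
Vector3Int[] coords = new Vector3Int[count[0]];
visibleBlocks.GetData(coords, 0, 0, count[0]);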

Also, I wanted to multithread things, and Dispatch can only be run on the main thread.

So I figured I'd try moving my code to the CPU and see how the performance fares.

It turns out that it worked very well! The reason is this:

The main overhead I had was draw calls, and moving everything to the CPU allowed me to increase chunk size and thus decrease the number of draw calls.


I decided to upgrade to Unity 2019, and the profiler is much more efficient. This was very nice, because before, if I let Unity deep profile for more than 200 frames (or at the very start when the world was generating), my computer would hang for 10-20 minutes and not even let me bring up the task manager, so I'd have to restart my computer. With the 2019 improvements, this is no longer a problem, making this whole optimization process much less frustrating :)

Anyway, so after moving steps 1 and 2 to the CPU, I needed to move the data to the GPU. I had three options:

Option A: Use the Unity builtin Mesh object
Option B: Use DrawProcedural
Option C: Use a native graphics plugin and manage the VBOs myself

I figured Option A made the most sense, so I made one Mesh object per chunk and tried it out. The performance was decent, but I kept getting occasional unpredictable spikes. I looked at the profiler, and they had to do with Unity "deciding" when the right time was to make VBOs, Unity trying to cull things, and related graphics work.

For those that don't know, a VBO (vertex buffer object) is the object you make when you want to pass triangle data from the CPU to the GPU. Unity has a builtin Mesh object that tries to abstract that away from you. For most purposes, this abstraction works pretty well.
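
Filling one in looks roughly like this (vertices, uvs, and triangles are the lists from step 2, and meshFilter is whichever MeshFilter the chunk uses; this is a sketch, not my exact code):

// Option A: hand Unity the triangle data and let it manage the VBO.
Mesh chunkMesh = new Mesh();
chunkMesh.indexFormat = UnityEngine.Rendering.IndexFormat.UInt32; // allow more than 65535 vertices per chunk
chunkMesh.SetVertices(vertices);      // List<Vector3>
chunkMesh.SetUVs(0, uvs);             // List<Vector2>
chunkMesh.SetTriangles(triangles, 0); // List<int>
meshFilter.mesh = chunkMesh;          // Unity uploads this to the GPU whenever it decides to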

However, I needed more fine grained control over exactly when the VBOs were being created and moved to the graphics engine, because these lag spikes were a big pain in the neck. I kept trying to tweak settings and fiddle with stuff to convince Unity to do things the way I wanted (spread the work out over multiple frames), but Unity kept being annoying.

I looked into the details, and you can get a little lower level control here (Option C) if you manage the graphics objects yourself. However, that would lose me a lot of platform portability, and I find native plugins generally a pain to use (I've had to write a few in the past for other related projects), so I decided to try Option B first.

To explain Option B: Unity has a DrawProcedural method, which works like this:

- Create a ComputeBuffer that has your graphics data


Then in the drawing loop, you can just do this:

- Give your drawing material that buffer using drawingMaterial.SetBuffer("DrawingThings", graphicsDataBuffer);
- Call drawingMaterial.SetPass(0); on a material that has a shader you wrote attached to it
- Call Graphics.DrawProcedural(MeshTopology.Triangles, numVertices);
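
Put together, a stripped-down version looks something like this. The shader attached to drawingMaterial is assumed to index into the "DrawingThings" buffer by SV_VertexID (that part isn't shown), and this uses the old DrawProcedural overload from above:

using UnityEngine;

public class ChunkRenderer : MonoBehaviour
{
    public Material drawingMaterial;
    ComputeBuffer graphicsDataBuffer;
    int numVertices;

    // Called once per chunk rebuild: a single CPU -> GPU copy.
    public void UploadChunk(Vector3[] triangleVertices)
    {
        if (graphicsDataBuffer != null) graphicsDataBuffer.Release();
        numVertices = triangleVertices.Length;
        graphicsDataBuffer = new ComputeBuffer(numVertices, sizeof(float) * 3);
        graphicsDataBuffer.SetData(triangleVertices);
    }

    // Called every frame: just bind the buffer and issue the draw.
    void OnRenderObject()
    {
        if (graphicsDataBuffer == null) return;
        drawingMaterial.SetBuffer("DrawingThings", graphicsDataBuffer);
        drawingMaterial.SetPass(0);
        Graphics.DrawProcedural(MeshTopology.Triangles, numVertices);
    }

    void OnDestroy()
    {
        if (graphicsDataBuffer != null) graphicsDataBuffer.Release();
    }
}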

This works well, but each draw call still has some overhead. To address this, I tried making my chunk sizes 64x128x64 (they used to be 16x16x16). Once I did this, I was surprised to find that I could render tons of chunks while keeping a very good frame rate!

The key here is that in a blocky type world, once you do world generation and compute lighting, blocks don't change that often. This means that it's okay to spend the few additional milliseconds recreating the triangles of an entire 64x128x64 chunk every time a single block changes. Once that's done, you just pass the data to the GPU, and then the overhead per frame is very low (since there are very few draw calls)!
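
In other words, editing a block just flags its (now much larger) chunk for a one-time rebuild, something like this (names illustrative):

// The chunk's triangles get regenerated once and re-uploaded before the next draw.
public void SetBlock(int x, int y, int z, int blockId)
{
    blocks[BlockIndex(x, y, z)] = blockId;
    GetChunk(x, y, z).needsRemesh = true;
}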

To show this, here is a screenshot of the world. My world generation is really simple right now, and I made the variability in height much higher than normal just to see what would happen. But yea, so that's my solution to rendering lots of blocks efficiently in Unity :)


Friday, April 12, 2019

Efficient Pathfinding in a Minecraft type world

As I said in my previous post, I want fairly complex NPC AIs where they sorta feel like you are playing with someone else. I want to have 5-10 of them all doing various things. To do this, the first piece of code I need is fast pathfinding: code that figures out how to get from point A to point B. My criteria for this pathfinding algorithm are:

- Fast: queries take just a few milliseconds
- Not very much preprocessing is required
- Intermediate results can be reused when possible (pathing is only slow the first time it is called in a new area)
- Works in a 3D minecraft type cube world
- When blocks are removed or added, it doesn't have to change very much

I did a lot of looking around, but wasn't able to find something that did what I wanted, so I just came up with my own algorithm. It has three key functions:

Generate pathing chunk: Given a chunk, generate a "pathing chunk"

This "pathing chunk" should abstract all the blocks in that chunk into "regions". Currently there are two types of regions:

- Walking regions:
  1. If a block can be stood on (is solid and has enough air above it), start a walking region (call this region H) at that block.
  2. Spread out to that block's 4 adjacent horizontal neighbors: if they can also be stood on and are in this chunk, add them to region H. Also spread out to the block below us and add it to region H, if it is solid and in this chunk. Repeat step 2 from every block that gets spread to.
      - If we are spreading to a horizontal neighbor (call this neighbor X) that cannot be stood on, look up to find the block above that can be stood on (call this block Y). Make sure we can jump between X and Y, and that there isn't a block in the way. This prevents situations like this:
AB
XA
BY
BB

B is a block
A is air
X and Y are also air; we can't jump from Y to X (or vice versa) because the B above is in the way.

  3. If our horizontal neighbor cannot be stood on and there is no standable block below it that we could jump back up from, make a "falling region" (call it region W) that just goes straight down until it hits a solid block, where it connects to the region there. Connect H to W, so we can go from H to W (but not from W to H: the connections are directional).
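
Condensed into code, the spreading looks something like this. CanStandOn, CanJumpBetween, IsSolid, InChunk, and HorizontalNeighbors are hypothetical helpers; my actual code handles more cases:

// Grow one walking region (H) from a standable seed block via flood fill.
HashSet<Vector3Int> region = new HashSet<Vector3Int>();
Queue<Vector3Int> frontier = new Queue<Vector3Int>();
region.Add(seed);
frontier.Enqueue(seed);

while (frontier.Count > 0)
{
    Vector3Int cur = frontier.Dequeue();
    // Step 2: spread to the 4 horizontal neighbors.
    foreach (Vector3Int next in HorizontalNeighbors(cur))
    {
        if (!InChunk(next) || region.Contains(next)) continue;
        // Only spread if the neighbor is standable and nothing blocks the jump.
        if (CanStandOn(next) && CanJumpBetween(cur, next))
        {
            region.Add(next);
            frontier.Enqueue(next);
        }
    }
    // Step 2: also spread to the solid block below, if it is in this chunk.
    Vector3Int below = cur + Vector3Int.down;
    if (InChunk(below) && IsSolid(below) && !region.Contains(below))
    {
        region.Add(below);
        frontier.Enqueue(below);
    }
}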


Connect two pathing chunks: Given two neighboring chunks, connect regions that can reach each other

I want to abstract this from the logic above, so this post will be cleaned up a bit in the future once I figure that out. That'll let me make more generic kinds of pathing regions, such as "dig here" regions.

Pathfind: My pathfinding can now work like this:

1. Generate pathing chunks in the goal and start regions (if needed)
2. Generate pathing chunks in regions adjacent to the start (if needed)
3. Do A* pathfinding in this "metagraph" of regions connected to each other. When computing the cost to go between two regions, do A* inside those regions (this is pretty cheap since regions are usually relatively small). Caching can also help here so we don't need to repeatedly A* between regions.
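
The two-level structure can be sketched like this. Region, AStar, BlockLevelAStar, and RegionHeuristic are illustrative placeholders, not my actual names:

// A* over the metagraph of regions; each edge cost is itself a small
// block-level A* within the two regions, cached so we only pay for it once.
Dictionary<(Region, Region), float> costCache = new Dictionary<(Region, Region), float>();

float RegionEdgeCost(Region a, Region b)
{
    if (!costCache.TryGetValue((a, b), out float cost))
    {
        cost = BlockLevelAStar(a, b); // cheap, since regions are usually small
        costCache[(a, b)] = cost;
    }
    return cost;
}

List<Region> FindRegionPath(Region start, Region goal)
{
    // region.Connections holds the directional links made when
    // neighboring pathing chunks were connected.
    return AStar(start, goal, r => r.Connections, RegionEdgeCost, RegionHeuristic);
}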




And that's it :) You can find the code that implements this idea here. It's fairly simple conceptually: just make a "metagraph" of "traversable regions" and then pathfind through those. The details are a little tricky and I'm still fiddling with ways to clean this code up a bit. Ideally I'd like to separate the logic of "connect two chunks" from the logic of spreading and creating regions, because that'll simplify things and make it much nicer to create regions, but that's currently in the works.

The nice part about this is that it satisfies all my criteria: when a block is modified, only that pathing chunk and the connections to its adjacent chunks need to be updated. Pathing chunks are fairly quick for me to make, and once they are complete, pathing can be done over hundreds of blocks very quickly :)

I considered doing something hierarchical, and perhaps I'll do that in the future, but as is the performance of this was good enough that I didn't really see a need for that.





Status so far

This project actually started around November of 2018, but after spending 4-5 months on it I decided it might be fun to write a blog so I can better document my progress, and share any insights I've gained during programming.

Here is what it looks like so far:

Currently, it has:


- Lighting:

I haven't implemented "smooth lighting" yet, but otherwise I'm pretty happy with this.

I have a "heightmap" thing that keeps track of the highest solid block at each x,z coordinate, and gives that block sunlight. This sunlight will spread out: if not touching the sun, a block's sunlight = (max of neighbor's sunlight values) - 1, clamped to have a minimum value of 0 and a maximum value of 15.

There is also a "block lighting" value that trickles in a similar way: If it is not producing light, a block's blocklight = (max of neighbor's blocklight values) - 1, clamped to have a minimum value of 0 and a maximum value of 15.
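
That trickling is basically a breadth-first flood fill. GetLight, SetLight, Neighbors, and IsTransparent are illustrative helpers (the same loop works for both sunlight and blocklight):

// Spread light outward from all sunlit / light-producing blocks.
Queue<Vector3Int> queue = new Queue<Vector3Int>(lightSources);
while (queue.Count > 0)
{
    Vector3Int pos = queue.Dequeue();
    int spread = Mathf.Clamp(GetLight(pos) - 1, 0, 15); // lose one light level per step
    foreach (Vector3Int n in Neighbors(pos))
    {
        if (IsTransparent(n) && GetLight(n) < spread)
        {
            SetLight(n, spread);
            queue.Enqueue(n); // keep trickling from here
        }
    }
}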

The shader gets both these values, and also a value for how "sunny" it is (from 0 to 1) that is multiplied with the sunlight. In pseudocode:


sunLevel = (value from 0 to 1, 0.1 at night, 1 at noon, can smoothly change at sunset)
sunLight = (value from 0 to 15)
blockLight = (value from 0 to 15)

actualLight = max(sunLevel*sunLight, blockLight)

This lets me avoid running lighting updates when it gets dark at night. I also used "tinting" so in dim lighting things get a blue shade, similar to how this happens irl due to the biology of our eyes. There isn't actually a daylight cycle yet, but there is a global "sunLevel" variable that I can change to make it look like night.

- Raycasting: I wrote code that raycasts to find which block your mouse is pointing at. It also raycasts to let you move around, so even if you are falling at a very fast speed you won't fall through blocks.
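
The core idea is to march along the ray and check each block cell it passes through. Here's a simplified fixed-step sketch (a proper grid traversal steps exactly from block boundary to block boundary; IsSolid is an illustrative helper):

// March along the ray in small steps until we enter a solid block.
bool RaycastBlocks(Vector3 origin, Vector3 dir, float maxDist, out Vector3Int hitBlock)
{
    dir = dir.normalized;
    for (float t = 0f; t < maxDist; t += 0.05f)
    {
        Vector3Int block = Vector3Int.FloorToInt(origin + dir * t);
        if (IsSolid(block)) { hitBlock = block; return true; }
    }
    hitBlock = Vector3Int.zero;
    return false;
}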

- Hold down shift to not fall off edges (like sneaking in Minecraft)

- Block Entities and player inventories. There can be stacks of items, and items can have durability and break

- Crafting and custom recipes

- Fairly basic generation: I have caves using a perlin worms algorithm and basic hills just using perlin noise, but this could be vastly improved and is probably going to be my main next focus. I'm still tweaking what makes a nice API here.
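
The hills part is essentially just scaled 2D Perlin noise per column, something like this (the constants are made up):

// Terrain height at each (x, z) column from 2D Perlin noise.
int GetGroundHeight(int x, int z)
{
    float noise = Mathf.PerlinNoise(x * 0.02f, z * 0.02f); // returns a value in 0..1
    return Mathf.FloorToInt(noise * 40f); // scale to a height in blocks
}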

- Rendering: I have 16x16x16 chunks (this size might change). My world is infinite horizontally and vertically, so you can dig down as much as you want and never hit bedrock. Today I moved the rendering code to a separate thread, which makes everything feel much smoother. I don't show block faces that aren't exposed to air.

I also have a system for custom "block models" using a format very similar to Minecraft's. This allows me to make things like this (ignore the gross textures, this is just an example)


My physics engine/raycasting will actually look at the small models themselves (by raycasting against each triangle) and not just treat them as an entire block, so you get nice behavior like only breaking a torch if you click on it, and being able to break the block below it.

- Saving: Worlds can be named, saved, and loaded (but the format is pretty gross right now and takes 10 seconds or so to save or load a world - I plan on improving this in the future)

I'm designing everything to be very modular and nice to use (in terms of coding), so adding new recipes and blocks with custom behaviors is very easy to do.

- Pathfinding: I'm really happy with my current pathfinding system; it can pathfind around hundreds of blocks very quickly, but I think it'll require a separate blog post to explain in full. There are still some things I want to do with this to enable mobs to strategically build and such as part of their pathing. I really like games like Dwarf Fortress and RimWorld, and I'd like to have much more complex NPC AI. For those of you that follow Tango Tek's Villager Mod series, that is more along the lines of the kind of gameplay that I am going for: you are a player yourself, but you interact with NPCs quite a bit to do things in the world. Except I also want the NPCs to be building things with you as well. I want them to feel like someone you are playing with.

- Multiplayer: I don't have this yet, but this is very important to me and is a big todo in the near future.






Introduction

I've always thought voxel type games were such a cool idea. Many years ago (around 2010) I wrote basic marching tetrahedrons code and was tinkering with a basic engine, but it never got very far. A bit after that, I learned about Minecraft. This was back in the Alpha days, but I still loved that game so much. It did so much of what I wanted in an open world creative building game, and it was so exciting to see how well a game like that could actually work.

Ever since then, I've tinkered with making cube type engines from time to time, but never got very far. Recently, I decided to create VR Sand (a falling sand/powder toy type game), and my design decisions ended up leading me to understand quite a bit about how to render large amounts of cubes efficiently. I realized that similar code could be used for a blocky type engine (in Unity), and decided to get started. This blog will be documenting the progress on making a game/engine of sorts. I don't really have any major goals; this is just something I really enjoy making, so I figured I'd see where it goes. My code is all open source, you can find it at https://github.com/Phylliida/Blocks

