As I’ve added more and more features to the Mood Flip code, the time it has taken for a test build to load on my old iPhone 4S relic has also gradually increased. Although I’ve tried to avoid the trap of “premature optimisation” for as long as I could, when a recent build basically tripled the average load time to 76 seconds (counted as the time from tapping the icon until the moment the main menu comes up and can be interacted with) I knew I had to do something about it.
For reference, a fresh Unity iOS build with a basic scene that’s nothing more than a cube and a camera loads in 7 seconds on my phone. In my own experience of playing other mobile games, if the time-to-first-user-interaction is less than ten seconds then it feels pretty quick. Between ten and twenty seconds, it starts to feel as if the game is unoptimised. And anything more than 20 seconds starts to induce frustration. I love Plants vs Zombies 2, for example, but every single time I start it up, I feel ever so slightly annoyed by how long it takes. And I’m a pretty patient gamer – how many other players, it makes me wonder, has PvZ 2 (and other similar games with long time-to-first-interaction delays) lost over the years just from people becoming impatient at seeing that rolling lawn loading animation for anything up to a minute before they can even tap an option? Of course part of the problem no doubt is that my phone is an absolute dinosaur, but still: a guiding principle, in my opinion, is that you should always design your app to minimise any avoidable pain points that risk causing your customer to give up on your game. My ideal target then is sub-10 seconds on my 4S. More recent iOS models, which the target audience for the game are likely to have, should in theory start up even more swiftly. (Android and its much larger diversity of specs will be a whole different kettle of fish, but I’ll tackle that when I get there.)
The Unity profiler is certainly a handy tool, but it can only tell you so much about what’s wrong. For this, I had to go old-school. I created a “tuning” copy of the whole project, built it to the phone and timed it, and then set about removing things one by one. Plugins were the first to go, but they only accounted for a small fraction of the load time. Then I set about applying “stand ins” for various large textures and audio files – this is basically where you replace, say, a 1 MB MP3 file with a placeholder of the same type that might only be a few kB. That build managed to shave about 20 seconds off the load time. I had already known that at some point I’d have to move those assets out of the player runtime – more on what this means below – but it was good to finally get an empirical measurement of what that roughly amounts to in real world timings.
But still there were 50-odd seconds of bloat that shouldn’t have been there. After much chopping and rebuilding, I finally tracked down the main culprit. Early on in my startup code, I run a function which pre-calculates all of the UV tile vectors for all of my models, and all of the vertex vectors for all of my grids. These calculations get done up front so that the game itself can simply look the results up, which is much faster than calculating them on the fly. The downside was that doing them at startup was impacting the initial load times. I had recently added a new level of detail for one model that pretty much amounted to an extra 3,000 calculations, and my poor old three-year-old phone didn’t have the grunt to keep up.
After some not inconsiderable head-scratching, I came up with a solution. I created a new scene in the project that isn’t part of the final build and whose only job is to perform all of those calculations at design time and then serialize the resulting arrays of vectors to binary files. I then used Unity’s AssetBundle feature to store those files in bundles, and placed those bundles in the StreamingAssets folder. All up, they only sum to about 50 KB of extra data, but pre-calculating all the maths outside of the running game and saving the results ahead of time has removed the single most intensive task that the startup process previously had to do.
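To make the idea concrete, here is a minimal sketch of the design-time step. It is written in Python for brevity and is not the project's actual Unity C# code. The grid function, the file layout (a 4-byte count followed by three little-endian 32-bit floats per vector) and all of the names are assumptions made purely for illustration.

```python
import struct


def compute_grid_vertices(cols, rows, cell=0.5):
    """Hypothetical stand-in for the expensive startup maths: one vertex per grid point."""
    return [(c * cell, 0.0, r * cell) for r in range(rows) for c in range(cols)]


def pack_vectors(vectors):
    """Serialize (x, y, z) tuples as a uint32 count followed by 3 float32 values each."""
    parts = [struct.pack("<I", len(vectors))]
    for x, y, z in vectors:
        parts.append(struct.pack("<3f", x, y, z))
    return b"".join(parts)


def unpack_vectors(data):
    """Inverse of pack_vectors: read the count, then each 12-byte vector."""
    (count,) = struct.unpack_from("<I", data, 0)
    return [struct.unpack_from("<3f", data, 4 + 12 * i) for i in range(count)]


if __name__ == "__main__":
    # Design time: do the maths once and keep only the bytes.
    blob = pack_vectors(compute_grid_vertices(cols=4, rows=3))
    # Run time: a cheap read replaces the calculation.
    print(len(blob), "bytes,", len(unpack_vectors(blob)), "vectors")
```

The point of the fixed layout is that reading the data back is a straight copy with no arithmetic beyond an offset, which is what makes the file route cheaper than redoing the maths on the device.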
Next I created an abstract Bundle class which just loads the passed-in bundle into memory and then calls an override function in the concrete class to unpack it, waits for that to finish, and then unloads the bundle from memory. Now here’s the clever bit: the job of the unpack function in the concrete class is to asynchronously unpack and/or deserialize the bundle’s contents back into whatever original type format is required – in this case, converting the bytes back into arrays of vectors. Once that’s done, the calling code can access that data through the properties of the concrete class that the bundle was unpacked into – there’s no need to pass or copy the data back to the calling code, which would just be unnecessary extra work. I just replace the old variable I was previously filling up on the fly with the new bundle property, which now holds exactly the same data, loaded from a file instead. There’s a little more to it than that obviously, but in effect this means that all I need to do is create a “new” variable of the type of the concrete class, yield until it’s loaded, and then use it. So now with all my caching calculations loaded in this way rather than grinding through them at runtime, the current load time stands at 26 seconds. The lesson to be taken from all this, then, is: whenever possible, offload any intensive mathematical functions that will always produce the same results to design time, and cache the output into a file; after doing this, loading the needed data back into memory through an AssetBundle or some other file operation appears to be considerably faster than processing the maths on the fly on one’s much slower mobile device.
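The shape of that base-class-plus-override arrangement can be sketched roughly as follows. Again this is Python rather than Unity C#, it reads a plain file instead of a real AssetBundle, and it loads synchronously to keep the example short. The class names, the `_unpack` hook and the byte layout are all hypothetical.

```python
import os
import struct
import tempfile
from abc import ABC, abstractmethod


class Bundle(ABC):
    """Base class: read the raw bytes, hand them to the subclass, then drop them."""

    def __init__(self, path):
        self.path = path
        self.is_loaded = False

    def load(self):
        with open(self.path, "rb") as f:
            raw = f.read()
        self._unpack(raw)      # concrete class turns bytes into usable data
        del raw                # stand-in for unloading the bundle from memory
        self.is_loaded = True
        return self

    @abstractmethod
    def _unpack(self, raw):
        """Deserialize raw bytes into properties on the concrete class."""


class VertexBundle(Bundle):
    """Concrete class: exposes the unpacked data as a plain attribute."""

    def __init__(self, path):
        super().__init__(path)
        self.vertices = []

    def _unpack(self, raw):
        # Assumed layout: uint32 count, then 3 float32 values per vertex.
        (count,) = struct.unpack_from("<I", raw, 0)
        self.vertices = [struct.unpack_from("<3f", raw, 4 + 12 * i) for i in range(count)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "grid.bytes")
        with open(path, "wb") as f:
            f.write(struct.pack("<I", 1) + struct.pack("<3f", 0.0, 0.5, 1.0))
        bundle = VertexBundle(path).load()
        print(bundle.vertices)  # callers read the property directly, no copying back
```

Because the unpacked data lives on the concrete object, the calling code only ever holds a reference to that object, which mirrors the "no need to pass or copy the data back" point above.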
I still have more work to do, but the hard part is done now that I know where the bottlenecks are and now that I have a grasp of AssetBundle usage and some custom classes to simplify loading them. In my tuning copy of the project, after I stripped out all those media assets and turned off the calculations temporarily, I was getting load times of around 8 seconds, which I found quite astonishing. Now that I know what else needs to be relocated, I am fairly confident that within the regular project I can now get pretty close to my 10-second par. Hopefully within the next few days I’ll get round to organising all the necessary bundle assignments and writing loaders for them.
A quick explanation here to address the runtime size point I made earlier, which may hopefully assist other developers. Within a Unity build, any media assets that are referenced within the scene (e.g. a material that you’ve referenced in a public inspector variable) or that are stored in the Resources folder (which apparently is essentially a kind of internal AssetBundle) will get included in the player runtime executable. A handy way to get a feel for what’s in your build is to right-click on the Console window in Unity after a build, open the Editor Log, and scroll about two-thirds of the way down to see a list of every asset and script included in the build, from largest to smallest, and an overview of the percentages used by different types of assets. Here’s an extract of what I was seeing at first, for example:
Textures         15.2 mb     54.9%
Meshes           245.8 kb    0.9%
Animations       8.5 kb      0.0%
Sounds           4.4 mb      15.9%
Shaders          255.4 kb    0.9%
Other Assets     287.7 kb    1.0%
Levels           480.5 kb    1.7%
Scripts          1.6 mb      5.9%
Included DLLs    5.1 mb      18.4%
File headers     98.5 kb     0.3%
Complete size    27.7 mb     100.0%
Note the big-ticket items: textures, audio, and DLL libraries alone account for 89% of all the extras. That 27.7 MB of assets is on top of whatever the size of the base Unity executable is, and all of it is built into the runtime. When an app runs, the phone first needs to load the whole player runtime into memory before the opening scene itself can even be started, and that’s not quick on such low-powered devices – the size of an app’s initial runtime executable will therefore directly impact load times, and by whole seconds rather than the milliseconds one might be used to on a PC. So when trying to improve overall load times, it’s good practice to defer as much of this resource-hogging as possible to later, by minimising the size of the main executable and splitting out assets to a separate set of data files that the runtime can pull in later as needed. In my tuning tests, simply replacing about half of the items in “Textures” and “Sounds” got the runtime load down from 26 seconds to 8 seconds, so there’s a pretty clear advantage to getting those out of the executable.
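If you want to check a breakdown like this without eyeballing it, a few lines of script will do. The following Python sketch parses the extract quoted above and picks out the largest categories. The regular expression assumes each line looks exactly like the ones in that extract, and the function names are made up for this example.

```python
import re

REPORT = """\
Textures 15.2 mb 54.9%
Meshes 245.8 kb 0.9%
Animations 8.5 kb 0.0%
Sounds 4.4 mb 15.9%
Shaders 255.4 kb 0.9%
Other Assets 287.7 kb 1.0%
Levels 480.5 kb 1.7%
Scripts 1.6 mb 5.9%
Included DLLs 5.1 mb 18.4%
File headers 98.5 kb 0.3%
Complete size 27.7 mb 100.0%
"""

# Category name, size, unit, then percentage; names may contain spaces.
LINE = re.compile(r"^(?P<name>.+?)\s+(?P<size>[\d.]+)\s+(?P<unit>kb|mb)\s+(?P<pct>[\d.]+)%$")


def parse_report(text):
    """Return (name, percentage) pairs for every line that matches the assumed format."""
    rows = []
    for line in text.strip().splitlines():
        match = LINE.match(line.strip())
        if match:
            rows.append((match["name"], float(match["pct"])))
    return rows


def top_categories(rows, n=3):
    """Largest n categories by percentage, ignoring the 'Complete size' total row."""
    parts = [row for row in rows if row[0] != "Complete size"]
    return sorted(parts, key=lambda row: row[1], reverse=True)[:n]


if __name__ == "__main__":
    top = top_categories(parse_report(REPORT))
    for name, pct in top:
        print(f"{name}: {pct}%")
    print(f"combined: {sum(pct for _, pct in top):.1f}%")
```

For this extract the three largest categories are Textures, Included DLLs and Sounds, which is where the 89% figure comes from.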
Now obviously some of that time saved will inevitably be incurred again, just later on in the flow – I’ll still need to load those same textures and music files back into memory and hook up their references again; now though it’ll be as post-menu-load AssetBundle loads rather than being baked into the player. But the main thing is that I can now do this on my own terms and using asynchronous loading. If I can get that main menu up and running within the first 10 seconds, then the user will have something to interact with and a positive User Experience feeling that “this game loads fairly quickly”. Meanwhile, behind the scenes, all my other manager classes are still feverishly loading in all those remaining assets that the game itself needs to run, but which were not required for the menu to be shown. By the time the user has clicked Play and the menu begins fading to black, with any luck the level data will be loaded, the necessary materials for the game objects and the skybox will be in memory, and the level music will be in place. And if the device is still grinding away? That fade to black might have to stay on a little longer. Overall though, the rule of thumb is to try to pace out the loading of assets to prioritise the most vital elements needed at any one time to satisfy the user’s expectations of responsive interaction. That first impression counts, and needs to be carefully managed. And then once everything is loaded, the rest of the play session should all run silky smooth after that, give or take the occasional need to shuffle various heavy components in and out of memory to keep resource costs down (such as music tracks that aren’t being played yet, or textures for Moodies the user hasn’t unlocked yet, for example), again employing the same guiding principle of planning out the loading process to cause as few delay spikes for the user as possible.
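The pacing described above can be sketched with a small asynchronous example. This is Python's asyncio rather than Unity coroutines, the sleeps stand in for real asset loads, and every name and duration here is invented for illustration.

```python
import asyncio


async def load_asset(name, seconds, log):
    """Pretend to load one asset; the sleep stands in for real file I/O."""
    await asyncio.sleep(seconds)
    log.append(f"loaded {name}")


async def startup(log, player_delay=0.0):
    # 1. Load only what the main menu needs, and block on that alone.
    await load_asset("menu", 0.01, log)
    log.append("menu interactive")

    # 2. Kick off everything else in the background while the menu is usable.
    remaining = [("level data", 0.02), ("materials", 0.03), ("music", 0.04)]
    background = [asyncio.create_task(load_asset(n, s, log)) for n, s in remaining]

    # 3. The player browses the menu for a while, then taps Play.
    await asyncio.sleep(player_delay)
    log.append("play tapped, fading to black")

    # 4. Hold the fade until any stragglers have finished loading.
    await asyncio.gather(*background)
    log.append("level ready")
    return log


if __name__ == "__main__":
    for entry in asyncio.run(startup([], player_delay=0.05)):
        print(entry)
```

If the player lingers on the menu, the background loads are already done by the time Play is tapped and the fade can be short; if the player taps straight away, the fade simply lasts until the final `gather` completes.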
So there you go. It took me a good week of frustrating iterations through the very slow Xcode build process, trial and error, and scouring the Internet for tips, to come up with all of that. But I think it’s been well worth the effort. All it takes is 20 seconds or so of aggravation over a slow load for a user to feel like your game isn’t worth the wait, whereupon in frustration they exit the app and delete it, and you’ve lost a player. For all the talk of churn and retention I read about on sites like Gamasutra, I get the feeling that as well as considering the specific mechanics of making the actual gameplay maintain user interest, it’s also important to keep in mind the general time-poorness and limited patience many people have these days. The difference between a game that loads almost right away and a game that forces you to endure a 30-second loading screen may well be crucial. I’d certainly be interested to know just what the statistics are for “this is taking forever to load; I’m giving up on this rubbish” churn levels where users have dropped out purely due to impatience rather than dissatisfaction with a game’s content itself. I’m sure the big analytics companies probably know these figures, but anyway, all I can do is go with my gut instincts and make the assumption that it’s in my best interests to keep load times low.
Now if you’ll excuse me, it’s time for some PvZ 2… Soon… In a little while… There you go.
Bye for now!