Boosting Starling 1.3 performance by 50%

It’s been a while since my last Starling blog post. The Starling version I was tuning back then was 0.9.1, while the current one is already 1.3. In the meantime lots of nice features have been added to the framework, including some performance optimizations, but I thought now would be a good time to check whether there is still something to improve on the performance side. So let’s start optimizing Starling 1.3.

Optimizing DisplayObject

The first place to start tuning is the DisplayObject class and its get transformationMatrix function. As its first step the function always sets the transformation matrix to the identity matrix. Then there are lots of if clauses checking whether the matrix should be scaled, skewed, rotated, translated or modified because of the pivot point. It is pretty clear that these operations will most likely overwrite the initial values that were written when setting the matrix to identity.

So how could we improve the performance here? Let’s first divide the function into two cases: one where the display object is skewed (most likely not a very common case) and one where it is not. For the skewed case we can use the original implementation, but for the more common case we do things a bit differently. First we stop setting the matrix to identity altogether. Instead we check if the display object is rotated, and if not we simply set the matrix to do the proper scaling and translation like this:

mTransformationMatrix.a = mScaleX;
mTransformationMatrix.b = 0.0;
mTransformationMatrix.c = 0.0;
mTransformationMatrix.d = mScaleY;
mTransformationMatrix.tx = mX - mPivotX * mScaleX;
mTransformationMatrix.ty = mY - mPivotY * mScaleY;

With this implementation there is no need to compare any of the values against zero or one, and there is no overhead from function calls either. Basically this is about as fast as the single call to the identity function that we removed, and we already have the final matrix. For the translation there is really no point in checking whether the values are zero, since in most cases they are not and the if check just adds one extra step to the execution. It’s also good to keep in mind that adding zero or multiplying by one doesn’t change the result.

For the case where the display object is rotated we calculate the cosine and sine of the angle and then do the matrix math ourselves, setting the transformation matrix’s a to mScaleX * cosA, b to mScaleX * sinA, c to -mScaleY * sinA and d to mScaleY * cosA (scaling is applied before rotation, so a and b carry mScaleX while c and d carry mScaleY). tx and ty start from mX and mY, with the pivot handled separately. Again there is no need to check whether values differ from one or zero to avoid the extra steps. For the pivot handling we divide the original if clause into two parts, first checking whether pivotX is non-zero and then pivotY. Here the number of operations possibly avoided by the check justifies its cost.
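To make the two non-skewed branches concrete, here is a minimal sketch. It is written in TypeScript rather than Starling’s ActionScript, and the Matrix2D and Transformable types and the updateMatrix function are hypothetical stand-ins, not Starling’s API. It assumes the Flash matrix convention (x' = a*x + c*y + tx, y' = b*x + d*y + ty) and that scaling is applied before rotation.

```typescript
// Stand-in for flash.geom.Matrix: x' = a*x + c*y + tx, y' = b*x + d*y + ty.
interface Matrix2D { a: number; b: number; c: number; d: number; tx: number; ty: number; }

// Hypothetical subset of a display object's state (names follow the post).
interface Transformable {
  x: number; y: number;
  pivotX: number; pivotY: number;
  scaleX: number; scaleY: number;
  rotation: number; // radians
}

// Non-skewed fast path: write the final values directly,
// without resetting the matrix to identity first.
function updateMatrix(o: Transformable, m: Matrix2D): Matrix2D {
  if (o.rotation === 0) {
    m.a = o.scaleX; m.b = 0;
    m.c = 0;        m.d = o.scaleY;
    m.tx = o.x - o.pivotX * o.scaleX;
    m.ty = o.y - o.pivotY * o.scaleY;
    return m;
  }
  const cosA = Math.cos(o.rotation);
  const sinA = Math.sin(o.rotation);
  m.a = o.scaleX * cosA;  m.b = o.scaleX * sinA;
  m.c = -o.scaleY * sinA; m.d = o.scaleY * cosA;
  m.tx = o.x;
  m.ty = o.y;
  // Pivot checks are split in two: each one can skip two multiplications.
  if (o.pivotX !== 0) { m.tx -= o.pivotX * m.a; m.ty -= o.pivotX * m.b; }
  if (o.pivotY !== 0) { m.tx -= o.pivotY * m.c; m.ty -= o.pivotY * m.d; }
  return m;
}
```

A quick sanity check for any such implementation is that the pivot point itself must always end up at (x, y) after the transformation.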

With these simple changes the Starling benchmark scene should be able to handle 5-10% more images.

Optimizing VertexData

The next class to start tuning is VertexData. It has copyTo and transformVertex functions that get called whenever a quad is added to a quad batch. Here we apply a similar idea as in the previous step: instead of first copying the values and then doing the matrix transformation on them, we pass the transformation matrix as a parameter to the copyTo function and assign the already transformed values to the target data.

This change should again slightly improve the performance.
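As an illustration of the combined copy-and-transform step, here is a minimal TypeScript sketch. The AffineMatrix type and the copyTransformed function are hypothetical names of mine, and the sketch assumes positions are stored as plain x, y pairs; the real VertexData layout in Starling differs.

```typescript
// Stand-in for flash.geom.Matrix: x' = a*x + c*y + tx, y' = b*x + d*y + ty.
interface AffineMatrix { a: number; b: number; c: number; d: number; tx: number; ty: number; }

// Copies x/y pairs from source into target starting at targetOffset,
// applying the matrix while copying instead of in a second pass.
function copyTransformed(
  source: number[], target: number[], targetOffset: number, m: AffineMatrix
): void {
  for (let i = 0; i < source.length; i += 2) {
    const x = source[i];
    const y = source[i + 1];
    target[targetOffset + i] = m.a * x + m.c * y + m.tx;
    target[targetOffset + i + 1] = m.b * x + m.d * y + m.ty;
  }
}
```

Each vertex is then read once and written once, rather than being written to the target and then read and written again by a separate transform call.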

The next change is the biggest one and will also have the biggest effect on performance. Starling now has a tinted parameter that tells whether a quad or image uses coloring or is partially transparent, but it’s not really used for anything other than selecting the correct fragment shader. Since, once again, most images will probably neither use coloring nor be partially transparent, the color data should not be copied between vertex data instances or sent to the GPU when updating the vertex buffers. To achieve this optimization we divide the mRawData vector in the VertexData class into three separate vectors: one for position data, another for color data and a third for texture data. After all the changes in the VertexData class it’s also necessary to have three vertex buffers in QuadBatch to match the three vectors.

When we start passing the tinting parameter to the VertexData class’s copyTo function, we can copy the color data only if tinting is in use. The same logic can be used in the QuadBatch class’s syncBuffers function, so that the color vertex buffer is updated only if tinting is used. For the cases where tinting is not used this halves the amount of data first copied between VertexData instances and later uploaded to the vertex buffers.
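The split into three vectors can be sketched roughly as follows. This is TypeScript, not the actual Starling code: SplitVertexData, filledArray and copyRange are hypothetical names, and the sketch assumes 2 position values, 4 color values and 2 texture values per vertex, which is what makes the untinted copy half the size.

```typescript
// Builds an array of the given length with every element set to value.
function filledArray(length: number, value: number): number[] {
  const result: number[] = [];
  for (let i = 0; i < length; i++) result.push(value);
  return result;
}

// Copies all of source into target at offset; returns how many values were copied.
function copyRange(source: number[], target: number[], offset: number): number {
  for (let i = 0; i < source.length; i++) target[offset + i] = source[i];
  return source.length;
}

// Assumed layout per vertex: 2 position, 4 color (RGBA), 2 texture values.
class SplitVertexData {
  positions: number[];
  colors: number[];
  texCoords: number[];

  constructor(numVertices: number) {
    this.positions = filledArray(numVertices * 2, 0);
    this.colors = filledArray(numVertices * 4, 1);
    this.texCoords = filledArray(numVertices * 2, 0);
  }

  // Copies this data into target starting at vertex index targetVertex.
  // Color data is skipped entirely when the quad is not tinted.
  copyTo(target: SplitVertexData, targetVertex: number, tinted: boolean): number {
    let copied = copyRange(this.positions, target.positions, targetVertex * 2);
    copied += copyRange(this.texCoords, target.texCoords, targetVertex * 2);
    if (tinted) copied += copyRange(this.colors, target.colors, targetVertex * 4);
    return copied;
  }
}
```

With this layout a four-vertex quad copies 16 values when untinted and 32 when tinted, and the same tinted flag can decide whether the color buffer needs to be uploaded at all.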

After these changes you can expect around a 40% improvement in the image count the Starling benchmark scene can handle. The improvement should show up on both desktops and mobile devices.

Optimizing event handling

If you still want to improve the performance, the next place to start tuning is the event dispatching. For example, on each frame update when the Stage class’s advanceTime function is called, it iterates through all the display objects to collect a list of those that are actually listening to the enter frame event. Since most likely only a couple of your thousands of display objects are listening to this event, a more efficient way is to keep a list of these display objects in the Stage class. This can be achieved by making the display objects report the addition and removal of event listeners through their parents all the way up to the Stage. You will also need to modify the function for setting the display object’s parent a bit so that the changes are properly reported to the Stage.
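The reporting idea could look roughly like the following TypeScript sketch. SketchNode and SketchStage are hypothetical classes of mine with a single enter frame listener per node; Starling’s real event system and display tree are considerably richer than this.

```typescript
type FrameListener = (passedTime: number) => void;

class SketchNode {
  parent: SketchNode | null = null;
  children: SketchNode[] = [];
  onEnterFrame: FrameListener | null = null;

  // Ordinary nodes just pass the report on to their parent; the stage overrides this.
  reportListener(node: SketchNode, added: boolean): void {
    if (this.parent) this.parent.reportListener(node, added);
  }

  setEnterFrameListener(listener: FrameListener | null): void {
    if (this.onEnterFrame) this.reportListener(this, false);
    this.onEnterFrame = listener;
    if (listener) this.reportListener(this, true);
  }

  addChild(child: SketchNode): void {
    if (child.parent) child.parent.removeChild(child);
    child.parent = this;
    this.children.push(child);
    child.reportSubtree(true);
  }

  removeChild(child: SketchNode): void {
    const index = this.children.indexOf(child);
    if (index < 0) return;
    child.reportSubtree(false); // report while the path to the stage still exists
    this.children.splice(index, 1);
    child.parent = null;
  }

  // Reports every listening node in this subtree when the parent changes.
  private reportSubtree(added: boolean): void {
    if (this.onEnterFrame) this.reportListener(this, added);
    for (const c of this.children) c.reportSubtree(added);
  }
}

class SketchStage extends SketchNode {
  private frameListeners: SketchNode[] = [];

  reportListener(node: SketchNode, added: boolean): void {
    const index = this.frameListeners.indexOf(node);
    if (added && index < 0) this.frameListeners.push(node);
    if (!added && index >= 0) this.frameListeners.splice(index, 1);
  }

  // Only the registered nodes are visited; the display tree is not traversed.
  advanceTime(passedTime: number): void {
    for (const node of this.frameListeners.slice()) {
      if (node.onEnterFrame) node.onEnterFrame(passedTime);
    }
  }

  listenerCount(): number { return this.frameListeners.length; }
}
```

The cost moves from every frame to the comparatively rare moments when a listener is added or removed or a subtree changes its parent.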

With this additional change you can expect a 50% improvement in the image count handled by the Starling benchmark scene. In my opinion this is quite a remarkable achievement.

Rectangle packing

Something that the Starling framework, for example, is still missing is a rectangle packing utility with which you could generate a texture atlas at runtime. After some Googling I didn’t come across any really good examples, so I spent a couple of hours writing one of my own. Since this is a free-time project of mine, this time I am also releasing the source code.

Rectangle packing

The idea in rectangle packing is to place smaller rectangles inside a bigger container rectangle as tightly as possible. This is especially useful when generating big textures containing many sub-textures. My implementation uses the concept of “free rectangles” within the main rectangle. A packed rectangle is always placed in the top left corner of some free rectangle that it completely fits into. To get very close to optimal packing, the rectangle is placed in the top-most of the left-most free rectangles that it fits into.

The algorithm

Initially there is naturally only one “free rectangle”, which is the main rectangle itself. After packing the first rectangle, the original free rectangle is removed and replaced by zero to two new free rectangles. If the packed rectangle is as big as the container, there are no more free rectangles. If the packed rectangle is as wide or as tall as the container, there is one free rectangle either below it or on its right side. If the packed rectangle is smaller, there is one free rectangle below it and one on its right side. Packing the next rectangle happens the same way, except that any other free rectangle the packed rectangle intersects is also cut into new, smaller free rectangles around the packed rectangle. During the process all free rectangles that are fully contained by another free rectangle are removed.
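The steps above can be sketched compactly as follows. This is a simplified TypeScript illustration, not the released source: Rect, FreeRectPacker and the helper functions are hypothetical names, there is no sorting or rotation of the input, and nothing here is tuned for speed.

```typescript
interface Rect { x: number; y: number; width: number; height: number; }

function intersects(a: Rect, b: Rect): boolean {
  return a.x < b.x + b.width && b.x < a.x + a.width &&
         a.y < b.y + b.height && b.y < a.y + a.height;
}

function contains(outer: Rect, inner: Rect): boolean {
  return inner.x >= outer.x && inner.y >= outer.y &&
         inner.x + inner.width <= outer.x + outer.width &&
         inner.y + inner.height <= outer.y + outer.height;
}

// Cuts free rectangle f around placed rectangle p into up to four new free rectangles.
function splitAround(f: Rect, p: Rect, out: Rect[]): void {
  const fRight = f.x + f.width, fBottom = f.y + f.height;
  const pRight = p.x + p.width, pBottom = p.y + p.height;
  if (p.x > f.x) out.push({ x: f.x, y: f.y, width: p.x - f.x, height: f.height });
  if (pRight < fRight) out.push({ x: pRight, y: f.y, width: fRight - pRight, height: f.height });
  if (p.y > f.y) out.push({ x: f.x, y: f.y, width: f.width, height: p.y - f.y });
  if (pBottom < fBottom) out.push({ x: f.x, y: pBottom, width: f.width, height: fBottom - pBottom });
}

// Drops every rectangle fully contained in another (keeps one copy of exact duplicates).
function removeContained(rects: Rect[]): Rect[] {
  return rects.filter((r, i) =>
    !rects.some((o, j) => j !== i && contains(o, r) && (!contains(r, o) || j < i)));
}

class FreeRectPacker {
  private freeRects: Rect[];

  constructor(width: number, height: number) {
    this.freeRects = [{ x: 0, y: 0, width: width, height: height }];
  }

  // Returns the placed rectangle, or null if it does not fit anywhere.
  pack(width: number, height: number): Rect | null {
    // Pick the top-most of the left-most free rectangles the new one fits into.
    let best: Rect | null = null;
    for (const f of this.freeRects) {
      if (f.width < width || f.height < height) continue;
      if (best === null || f.x < best.x || (f.x === best.x && f.y < best.y)) best = f;
    }
    if (best === null) return null;
    const placed: Rect = { x: best.x, y: best.y, width: width, height: height };
    // Cut every free rectangle the placed one intersects; keep the others as they are.
    const next: Rect[] = [];
    for (const f of this.freeRects) {
      if (intersects(f, placed)) splitAround(f, placed, next);
      else next.push(f);
    }
    this.freeRects = removeContained(next);
    return placed;
  }

  freeCount(): number { return this.freeRects.length; }
}
```

For example, packing a 40x30 rectangle into a 100x100 container leaves two free rectangles, one to the right of it and one below it, exactly as in the first step described above.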


The image below shows how the free rectangles are combined. The image on the left side shows that there are two free rectangles after placing the first rectangle. Placing the second rectangle would divide free rectangle “2” into two new free rectangles (on the right side and below) and free rectangle “1” into three new free rectangles (above, on the right side and below), but here two of these new free rectangles are completely contained in bigger free rectangles, so the total number of free rectangles after packing the second rectangle is three.

To see this rectangle packing in action click the image below. Drag the orange circle at the bottom right corner of the container rectangle with your mouse (keeping the left mouse button down) to see how packed rectangles move around the area. With a decent computer the packing of 500 rectangles takes about 1 millisecond.

You can download the full source code for the demo from GitHub.

As the copyright notice in the source files says, you may use and/or modify the source code freely, but do not remove the copyright notice or move the files into other packages. If you find the utility especially useful, you can mention me in the credits too.

Update: The version available since the 22nd of August 2012 is almost 10 times as fast as the original one.

Angry Birds Heikki is out!

Hello to all Angry Birds fans again!

The highly anticipated Angry Birds Heikki went live yesterday. The game is inspired by and themed around Heikki Kovalainen. Currently one level called “Silverstone” is open, and the remaining eleven levels will open as Heikki’s season progresses. The game again uses Flash 11 technology to give everyone smooth gameplay, and it has some really nice levels and graphics. Check it out now and keep checking back later as more and more levels are unlocked!

Remember also to keep checking Heikki’s Facebook page to get the power-up codes. With the codes you unlock a special Heikki version of Terence, which you can use in all the currently open levels for one week. I hope you all have a nice time playing this game!

Angry Birds goes Coke

Hello Angry Birds fans! There’s another branded version of Angry Birds out there. This time it’s for Coca-Cola China, and in addition to the traditional game mechanic the players now collect drum beats by hitting drums that are placed in the levels. By collecting these beats you will both unlock two levels of your own and help to reach the community goal, which will in turn unlock three new levels. The levels in this game, and especially the locked ones, are pretty cool, so I recommend that all Angry Birds fans try the game!

Since the game is in Chinese, here are short instructions on how to register and log in:

In the registration screen you need to provide your email, a new password for this service and a nickname to be used on the leaderboards.

In the login screen you log in with your email and password.

Hope you all will enjoy this game!

Angry Birds top rated game in Facebook

After Facebook launched its App Center it was nice to see Angry Birds Friends as number one among the top rated games, with a rating really close to five stars. Big thanks to all you Angry Birds fans out there! It’s your positive feedback that makes the day for us game developers. Also, if you haven’t yet tried the recently introduced “weekly tournaments”, now is a perfect moment for that!

Back from Multi-Mania

I spent three really nice days in Kortrijk, Belgium at the Multi-Mania 2012 event at the beginning of this week. My own presentation about bringing Angry Birds to Facebook was on Tuesday. Since I already gave one Angry Birds presentation at Adobe MAX last year, you might not expect me to have been too nervous about giving another one to an audience consisting mostly of students, but the catch here was that there were more than 1000 people in the fully booked main room. Anyway, I believe it went pretty well after all, even if I still have a lot to improve in my (English) speaking skills. The keynote that Aral Balkan gave certainly left a lot to think about, also regarding how to present things. I’m an awful singer so I can’t beat Aral’s opening act, but I’ll try to have 1% of that energy next time I’m in front of an audience.

It was also really nice to meet lots of other speakers both before and after the presentations. It was especially nice talking with you, Michael. All the best with the Delta-Strike release coming in a couple of months! And then the fans. It seems that Howest university has some of the most hardcore Angry Birds fans on this planet. I’m really happy to have had the possibility to talk with so many of you and also to get the chance to eat the best pitta in the whole of Kortrijk. 😉

Finally it’s time to thank Koen and Angelo. Thank you for inviting me to Multi-Mania. The event you are organizing is simply amazing. It was wonderful being there and if anyhow possible I’ll try to be back. No, I’ll rephrase that: I’ll be back.