IP MASQ shortcuts causing Grapple to fail

This article was originally published on the LGP Blog on Saturday, July 4th, 2009

This is a technical article that goes to some depth into networking protocols. Reading this article, you will need some overall knowledge of how UDP networking works, and a basic idea about what NAT and IP MASQ is.

We’ve just spent quite a while investigating issues with a small number of games failing to make connections to perfectly good servers. Grapple, the LGP networking layer, is designed to work around the presence of firewalls, NAT, and all that good stuff, so that users can host a game from any machine on their network, whether it be exposed to the outside world or not.

So, when we found out that some people couldn’t connect to games, we were a bit concerned. After all, our code should be just fine for this. It has worked in all of our test scenarios.

Unfortunately we assumed that all networking at the kernel level worked. This was a mistake.

So, take a scenario, lets say host A (lets call it gateway) is the gateway server, running IP MASQ to NAT all servers on its local network. Host B (call it laptop) is a machine on the internal network.

Now, as we run through this article I will run you through some aspects of NAT, STUN, TURN, and other nice things about firewall and NAT handling. I wont be going through the exact protocols, but I’ll give you a bit of an idea how it works.

So, Laptop is the machine I want to use to start up a game server. I start a server and bind a UDP socket to port 2000 on all interfaces. I then use STUN to find out what kind of connection I am on. The method for STUN is shown below.

STUN tells me that the server is on a symmetric NAT. This means that the only things that can communicate with the server, are things that the server has talked to itself first.

This is the most restrictive type of NAT, under usual circumstances it will mean that it is effectively impossible to make connections to this laptop game server without making manual traffic routes through your NAT routing table on the gateway machine for each client. Any client connecting to the server will try and connect to the laptop, but any routing will be rejected because the laptop has not first connected to the client.

However, Grapple can handle this, as I will demonstrate.

So, on the gateway machine (it is important for later on to note that the gateway machine is also the client trying to connect to the laptop server to play the game), I start up a client. The client tries to directly connect to the server, sending a number of UDP connection requests. The server does not receive them.

This is not unexpected, as demonstrated earlier, the server is behind a symmetric NAT and the server hasn’t sent any data to the client, and so the servers routing will reject anything being sent to it from the client.

So, at this point it looks fairly hopeless, how can the server magically know the client is connecting to it.

Well, the answer lies in the STUN server. The laptop server has already talked to the STUN server and so the NAT routing tables allow the STUN server to send more data to the laptop.

So, the client gateway knows about the same STUN server, and so sends a request to the STUN server to ask the laptop server to connect to it.

The laptop receives this request, and connects to the address that the STUN server requests. The client should then, we hope, be able to communicate.

Now at this point this can go two ways.

1) Client is not behind a symmetric NAT or firewall

In this case, the client receives the data packet, and communication can start. We’re pretty much done handshaking, the client and server can talk to each other.

2) Client IS behind a symmetric NAT or firewall

The client will never receive the packet from the server. No communication is possible.

Now, in the instance in question, we discover the situation is an impossible one. The client receives the packet from the server, but then is unable to reply.

This is completely impossible, the routing table to the client has seen a packet to from the laptop to the gateway, and so should allow 2 way communication between the two machines (on the same ports). But for some reason – it wasn’t!

This was the sticking point. How could this be possible. We knew that the laptop worked this way because we had communicated with the STUN server successfully, and this relies on the fact that an outbound packet creates a tunnel back to the server. This is vitally important for all networking, or you would just send out data and never get anything back, rendering NAT pretty much useless.

The issue turned out to be the IP MASQ NAT on the gateway.

The laptop sent a packet to the external IP address of the client. This means that it should have gone out to the NAT layer, and been NAT’d to the external IP and port of the server socket communicating to the client. The client would have received a request that external IP 123.123.123.123 was trying to talk, and would respond to that address. The response would have gone through the same IP MASQ layer, and the server would get back a response from the location it expected.

HOWEVER

IP MASQ has decided in its infinite wisdom to take a short cut. It spots a packet will be routed to the gateway, and decides that, hey, they are on the same internal network. They can talk directly! And so, the packet presented to the client is not showing as coming from the outside internet, it is shown as coming from 10.2.2.13 (the internal IP address of the laptop). And so, when it gets this reply, it sends a response back. As the response never passes through the IP MASQ because it is going over the internal network, then the response to the server comes from 10.2.1.1 and as the server sent its data to 123.123.123.123 the servers own firewall (independent of the NAT) says ‘hey, I don’t know you’ and rejects the packet.

All because IP MASQ takes a short cut.

In the end, the problem is solved that if the client doesn’t receive further communication from the server, then it falls back to its final communication option, TURN. We have already demonstrated that the STUN server can talk to both sides, as both sides initiated communication with the server. This means that the STUN server can act as a relay between the client and the server, sending any and all data via this third party. However to be honest, we don’t like doing that as obviously it will increase latency of games, and also it means that any and all game traffic that has to resort to TURN has to be passed via our servers. A few games running at once, no problem, but more and more games means more and more bandwidth used, and we’d like to avoid that!

So in the end, the problem is solved, but really, if not for IP MASQ making a shortcut for efficiency, there would never have been a problem in the first place.

[Edit: I am adding the layout of the network below, to explain a little better]

Variable length C macros

This article was originally published on the LGP Blog on Saturday, May 30th, 2009

Something that came up in porting X2 and X3 is that Visual Studio in Windows handles variable length macros, and as far as we could see, GCC didnt.

This means that in gcc, if you wanted to create a define that would do a job and report where it was called from, you would need to do something like:

void mydebug(const char *file,int line,const char *fmt,...)

#define DEBUGOUT1(x) mydebug(__FILE__,__LINE__,x)
#define DEBUGOUT2(x,y) mydebug(__FILE__,__LINE__,x,y)
#define DEBUGOUT3(x,y,z) mydebug(__FILE__,__LINE__,x,y,z)

and so on…

However, there is a solution which does the job, which we found after a fair amount of investigation. The recommended method would be

#define DEBUGOUT(x,y...) mydebug(__FILE__,__LINE__,x,y)

Which will work, but only if you actually add a second parameter to the DEBUGOUT call. Without, it will expand

DEBUGOUT(x)

to

mydebug(__FILE__,__LINE__,x,)

which will obviously fail to compile. To fix this, simply bear in mind that … in the macro just means everything else, so you can do

#define DEBUGOUT(x...) mydebug(__FILE__,__LINE__,x)

which then works with single and multiple values in macros, x becomes the fmt AND the … for the mydebug .

Being sure of your shared objects

This article was originally published on the LGP Blog on Wednesday, April 15th, 2009

This short article is all about how to be sure you know exactly what your application is doing with shared objects.

It can be quite easy sometimes to not know which objects are being used by a program. Sure, you can ldd a binary and find out what it links to, but which order are they being used, and what about when the program dlopens something, as many do. How can you easilly find out what it is using and how it finds it?

Simply, you can run an application as such:

LD_DEBUG=libs ./yourapp

This will give you pages of output telling you everything that the application is doing with shared objects, and you will note that each time it opens a shared object, it will report this with ‘init’, leaving you able to simply grep to see which .so’s it is using.

Hopefully this helps people that are trying to work out just quite what their apps are doing sometimes!

A little-known trick with C switches

This article was originally published on the LGP Blog on Tuesday, March 3rd, 2009

OK, so this little programming post is nothing like rocket science, it really isn’t. It is, however, REALLY useful, and something that almost nobody knows about the C language. We came across it when we were porting a game a few years ago, and none of us had ever seen it before. We tried it, and it works!

So, lets take a C switch statement

switch (somenumber)
{
case 1:
case 2:
case 3:
  do_something();
  break;
case 4:
case 5:
case 6:
  do_something else();
  break;
}

Fairly standard, nothing too unusual here. However there is another way to do it!

switch (somenumber)
{
case 1 ... 3:
  do_something();
  break;
case 4 ... 6:
  do_something else();
  break;
}

Unfortunately, it doesn’t seem to work on anything other than C/C++, which is a shame!

Enjoy!

How to search a data tree efficiently

This article was originally published on the LGP Blog on Monday, February 23rd, 2009

This nice little method is one we used for PenguinPlay.

Assumptions: You know some C, and understand tree structures

If you have a whole load of data, lets say a hierarchy of categories with items in them, like in an online store, it is common to store this data in a tree. This allows you to move up and down the tree logically to access the data in the same way it is visualised.

With the logical way of doing this, there comes a problem of scalability. How do you find all of the categories that are below a certain point in the tree. Lets take a tree as an example

            +----(2)Windows
(1)Games----|               +----(5)Strategy
            +----(3)Linux---|                       +----(7)Action
            |               +----(6)First Person----|
            +----(4)Mac                             +----(8)RPG

Now to find all categories that are children of Linux, how do you do this?

Well, most of the time, the simplest way to write some code for this will find all of the categories that are children of Linux (5,6) and then for each of these find all of the children of these (7,8) and so on and so forth down the hierarchy.

This isn’t too inefficient in smaller trees, but imagine having ten levels, each with five branches at each level. You very quickly get an impossible number of items to deal with using this method.

So, how do we do this efficiently?

The solution is quite simple. It takes a moment to understand, and then it just makes you think ‘Why didn’t I think of that!’

Initially lets draw the above example again, but in a different way

+----------+
|          |
|          +------------+
|          | (2)Windows |
|          +------------+
|          |
|          +------------+
|          |            |
|          |            +-----------------+
|          |            | (5)Strategy     |
|          |            +-----------------+
|          |            |
|          |            +-----------------+
|          |            |                 |
|          |            |                 +-----------+
|          | (3)Linux   |                 | (7)Action |
| (1)Games |            | (6)First Person +-----------+
|          |            |                 |
|          |            |                 +-----------+
|          |            |                 | (8)RPG    |
|          |            |                 +-----------+
|          |            |                 |
|          |            +-----------------+
|          |            |
|          +------------+
|          |
|          +------------+
|          | (4)Mac     |
|          +------------+
|          |
+----------+

As you can see, if you look at it, the layout above shows the exact same data that is shown in the first tree, just displayed differently.

So, now onto how this helps us. Firstly, take your first category, (1)Games. We assign each category a minimum and maximum number. As we are starting, we use 1 and 2

1 +---------+
  | (1)Game |
2 +---------+

Simple enough. Now, lets add the first child, the (2)Windows. Now, as this is a child of (1)Game, then its minimum and maximum must be inside of the min and max for its parent. So, we simply change the maximum of (1)Game to 4, and set the min of (2)Windows to be 2 and the max to be 3

1  +----------+
   |          |
2  |          2------------+
   | (1)Games | (2)Windows |
3  |          3------------+
   |          |
4  +----------+

Now, lets add in category 3 and 4, in each case, we insert the categories with min and max values that are inside of the parents, and not overlapping with any other children of that parent. Like so.

1  1----------+
   |          |
2  |          2------------+
   |          | (2)Windows |
3  |          3------------+
   |          |
4  |          4------------+
   | (1)Games | (3)Linux   |
5  |          5------------+
   |          |
6  |          6------------+
   |          | (4)Mac     |
7  |          7------------+
   |          |
8  8----------+

Simple enough, but what does that get us? Not a lot yet. Lets go a bit further down the tree, and add some children to the Linux category, and see what happens.

So, using the same rule, any child we add must have a min and max inside its parents, and its parent must expand to accommodate it. Lets add (5)strategy in as a child of Linux.

1  1----------+
   |          |
2  |          2------------+
   |          | (2)Windows |
3  |          3------------+
   |          |
4  |          4------------+
   |          |            |
5  |          |            5-------------+
   | (1)Games | (3)Linux   | (5)Strategy |
6  |          |            6-------------+
   |          |            |
7  |          7------------+
   |          |
8  |          8------------+
   |          | (4)Mac     |
9  |          9------------+
   |          |
10 10---------+

Now, by adding the child to Linux, you have to change lots of mins and maxes. It looks like it would be a real pain to keep track of, but simply, you can just go through a flat list of all categories, and add 2 to each min and max that are greater than or equal to the old maximum of the parent. So in this case, the Linux category min/max were previously 4 and 5, so anything higher than a 4, add 2 to it, and then you have 2 spare numbers for the min/max for the new category. Simple enough. Now we can add in the final categories, and the tree looks like it did in the example further up, but with each category given a min and max number

 1 1----------+
   |          |
 2 |          2------------+
   |          | (2)Windows |
 3 |          3------------+
   |          |
 4 |          4------------+
   |          |            |
 5 |          |            5-----------------+
   |          |            | (5)Strategy     |
 6 |          |            6-----------------+
   |          |            |
 7 |          |            7-----------------+
   |          |            |                 |
 8 |          |            |                 8-----------+
   |          | (3)Linux   |                 | (7)Action |
 9 | (1)Games |            | (6)First Person 9-----------+
   |          |            |                 |
10 |          |            |                 10----------+
   |          |            |                 | (8)RPG    |
11 |          |            |                 11----------+
   |          |            |                 |
12 |          |            12----------------+
   |          |            |
13 |          13-----------+
   |          |
14 |          14-----------+
   |          | (4)Mac     |
15 |          15-----------+
   |          |
16 16---------+

OK, so there we have it, we have a tree, showing a hierarchy, and all of the categories have numbers for a min and a max value. All numbers are unique, and all numbers belonging to a child are inside the scope of their parents. Great, but how does this help us?

Take a look. If you want a list of all categories that are children of Linux (min=4, max=13) then simply, get a list of all categories whose min is more than 4 and whose max is less than 13

Simple. In a single pass through a flat list of categories, you obtain a full list of all children. No matter how deep the tree system goes, as long as you keep to the rules of calculating the min and max values, this will always work.

It also works the other way. If say, you want a list of all parents of the RPG category (min=10,max=11), simply look for all categories whose min is less than 10 and whose max is more than 11, and you get the entire ancestry of that category. All in one pass.

This all comes in especially useful if you combine it with SQL. Complete tree segments can be obtained from a single efficient SQL query. Queries are easy to generate that can find, using the above tree as an example, all Linux games of any category!

I hope some people found this useful, and I hope that some of you had the reaction I had when someone first introduced this to me several years ago – ‘Thats very cool!’

Coding!

Yeah, I’m starting up a new blog, because I keep doing coding things that I enjoy, and since the LGP blog vanished, I haven’t been able to put them anywhere.

So, I’m importing the LGP blog stuff from archive.org, and I’m going to add new content afterwards, as it comes to me.

It’ll cover those things that don’t often have good answers online, things that are weird in code, or things that are fun in code.

That’s all, carry on.