Explanation of Kragen's .signature puzzle

Written 1999-07-08

Here's the program:

main(int c,char**v){char a[]="ks\0Okjs!\0\0\0\0\0\0\0",*p,*t=strchr
(*++v,64),*o=a+4;int s=socket(2,2,0);*(short*)a=2;p=t;while(*p)(*p++&48)
-48?*o++=atoi(p):0;connect(s,a,16);write(s,*v,t-*v);write(s,"\n",1);while
((c=read(s,a,16))>0)write(1,a,c);}

Here's a line-by-line (or expression-by-expression) dissection of the program. (You should read how to run the program first.)

main(int c,char**v) {

This is just the standard way to begin a C program's main function, which is what runs when you run the program. The number of arguments (including the program's filename) goes into c, and v points at an array of char pointers which point to the actual arguments. So v[0] is the name of the program, and v[1] is the command-line argument given to the program.

char a[]="ks\0Okjs!\0\0\0\0\0\0\0",

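This defines an array of characters and initializes it to the string above. Several things to notice about the string: the first two bytes, "ks", are placeholders that get overwritten with the address family further down; "\0O" is the port number, 79, in network byte order ('O' is ASCII 79); "kjs!" is four bytes of placeholder that the loop further down overwrites with the IP address; and the seven \0 escapes plus the string's terminating null byte make eight zero bytes at the end.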

*p,*t=strchr(*++v,64),

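p is an uninitialized char pointer, and the t expression does a couple of things: *++v advances v so that *v is the command-line argument rather than the program name, and strchr(..., 64) returns a pointer to the first '@' in that argument (64 is the ASCII code for '@').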

*o=a+4;

o is a char * that points to the "k" of "kjs", which is the fifth byte of a.

int s=socket(2,2,0);

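On Solaris, where I wrote this, this is socket(PF_INET, SOCK_STREAM, 0), which creates a TCP socket and returns its file descriptor. On Linux, SOCK_STREAM is 1 (and on most Linux architectures 2 is SOCK_DGRAM), so the literal 2 does not ask for a TCP socket there.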

*(short*)a=2;

Assuming that short is two bytes, which it usually is, this sets the "ks" to 2 in whatever the native byte order is. (Originally I just had \0\2 instead of "ks", but that's not portable.)
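
As a minimal sketch of what that assignment does to the buffer, here is the same store written with memcpy instead of a pointer cast. The function name set_family is made up for illustration and is not part of the original program; memcpy is used only because it sidesteps alignment and aliasing questions.

```c
#include <string.h>

/* Hypothetical helper: store the value 2 in the first bytes of buf
   as a native-byte-order short, like *(short*)a=2 does. */
void set_family(char *buf)
{
    short family = 2;                     /* 2 is AF_INET on the systems discussed here */
    memcpy(buf, &family, sizeof family);  /* copies sizeof(short) bytes, usually 2 */
}
```

If short is two bytes, the buffer then starts with the bytes 2, 0 on a little-endian machine and 0, 2 on a big-endian one.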

p=t;
while(*p)
    (*p++&48)-48 ? *o++=atoi(p) : 0;

Well, this points p at where t points, which is the '@'-sign in the command-line argument; then it loops while p points at something other than a null byte.

The expression (x&48)-48 extracts two bits from x, then compares to see if they are both set, returning 0 if they are both set and nonzero (specifically -16, -32, or -48) if they aren't both set. If x is a numeric character ('0' through '9') or one of a few other characters, this will be 0. But for most characters (specifically including '@' and '.'), it will be nonzero. I sat down with an ASCII chart to try to come up with a short expression that would discriminate between these two classes of characters, and this is what I got.
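
To make the two classes of characters concrete, here is a small sketch that wraps the same expression in a function. The names triggers_atoi and print_skipped_class are invented for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: nonzero when (ch&48)-48 is nonzero, i.e. when
   bits 0x10 and 0x20 are not both set in ch. */
int triggers_atoi(char ch)
{
    return ((ch & 48) - 48) != 0;
}

/* Prints every printable ASCII character for which the expression is 0,
   which is the class that contains the digits. */
void print_skipped_class(void)
{
    for (int ch = 32; ch < 127; ch++)
        if (!triggers_atoi((char)ch))
            putchar(ch);
    putchar('\n');
}
```

Here triggers_atoi('@') and triggers_atoi('.') are nonzero, while triggers_atoi('7') is 0. Lowercase letters split between the two classes: 'a' through 'o' give a nonzero result and 'p' through 'z' give 0.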

So p gets incremented each iteration, and the character it used to point to gets tested. If the test comes out nonzero, as it does for '@' and '.', then *o++ = atoi(p) gets evaluated; if it comes out zero, 0 gets evaluated, which doesn't do anything.

Now, at the beginning, o points to the "k" of "kjs". As the loop proceeds, the characters o points to get overwritten and o gets incremented to point to later characters in the string.

When atoi(p) is evaluated, p points at the character after the one that was '@' or '.'. So if I pass the argument "@128", the condition will be true on the first character, and atoi gets called with p pointing to the 1. atoi converts ASCII to integer, and it returns 128.

So you can pass an argument like @1.2.3.4, and the numbers in your argument will get written into successive characters in the string 'a', starting at the 'k'.
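
Here is a readable sketch of that loop, pulled out into a function so it can be tried on its own. The name parse_octets and the max parameter are assumptions added for illustration; the original loop has no bounds check at all.

```c
#include <stdlib.h>

/* Hypothetical helper: walk the string starting at the '@', and after each
   character for which (ch&48)-48 is nonzero, store atoi() of the rest of
   the string as one byte. Stops storing once max bytes are written.
   Returns the number of bytes stored. */
int parse_octets(const char *at, unsigned char *out, int max)
{
    int n = 0;
    const char *p = at;
    while (*p) {
        char ch = *p++;                        /* test the old character, then advance */
        if (((ch & 48) - 48) != 0 && n < max)
            out[n++] = (unsigned char)atoi(p); /* atoi stops at the first non-digit */
    }
    return n;
}
```

With the argument "@1.2.3.4" this stores the four bytes 1, 2, 3, 4 and returns 4.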

connect(s,a,16);

OK, so 'a' is being passed as the second argument of connect(). That means the string 'a' must contain a struct sockaddr. Looking at the struct sockaddr definition, at least on Solaris, you can see that the first two bytes contain the "family" of the socket in native byte order; 2 means AF_INET, the Internet family, so it's really a struct sockaddr_in.

The next two bytes in a struct sockaddr_in are the port number in network byte order -- in this case, 79. The next four bytes are the IP address, in this case, whatever that loop up above put in there, which is whatever you passed on the command line in readable ASCII. The next eight bytes are supposed to be zeroes, and they are, unless you passed too many numbers on the command line.
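
For comparison, this is roughly how the same address would be built the conventional way, with the struct fields and htons instead of a hand-written string. It is a sketch, not the original program's code; the function name finger_addr is made up.

```c
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: build a struct sockaddr_in for port 79 at the
   given IPv4 address (four bytes, most significant first). */
struct sockaddr_in finger_addr(const unsigned char ip[4])
{
    struct sockaddr_in sa;
    memset(&sa, 0, sizeof sa);    /* zero everything, including the padding */
    sa.sin_family = AF_INET;      /* native byte order */
    sa.sin_port = htons(79);      /* network byte order: bytes 0, 79 */
    memcpy(&sa.sin_addr, ip, 4);  /* already in network byte order */
    return sa;
}
```

The two bytes of sin_port come out as 0 followed by 'O', which is the "\0O" in the string.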

Port 79 is the finger port.

write(s,*v,t-*v);

This writes a string to the socket, beginning at the beginning of the first command-line argument, and with t-*v characters in it. t-*v is the number of characters in the argument before the @-sign. So if the argument is kragen@3, then this will write "kragen" to the socket.
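
The pointer subtraction can be sketched on its own like this. The function name user_length is invented for illustration, and unlike the original it checks for a missing '@'; in the original program strchr would return a null pointer in that case and the loop would dereference it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of characters before the first '@',
   or -1 if the argument has no '@'. */
ptrdiff_t user_length(const char *arg)
{
    const char *at = strchr(arg, 64);  /* 64 is '@' in ASCII */
    return at ? at - arg : -1;
}
```

For "kragen@3" this returns 6, the length of "kragen"; for "@1.2.3.4" it returns 0, so nothing is written before the newline.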

write(s,"\n",1);

This sends a newline to the socket; you have to do this before the finger server will answer you.

while ( (c=read(s,a,16)) > 0)
    write(1,a,c);
}

This reads from the socket into the buffer 'a', destroying the struct sockaddr_in we built there earlier. The read function returns the number of bytes read on success, 0 on end of file (i.e. connection closed), or -1 on error. If there's no data to read, it hangs until there is, or until there's an EOF or error.

So the program waits here until the finger server sends back some data, then reads up to 16 bytes of it. Then it writes however many bytes it got to fd 1, which is standard output. Then it goes back again and repeats the process.

Earlier versions didn't bother to check whether read returned >0, just whether or not it was 0. Thus if the connection wasn't properly set up, read would return -1, and write would interpret that as a request to write all of memory. This would fail (although on Linux 2.1 and 2.2, it would fail after writing a substantial amount of garbage). Worse, it would iterate, and on Linux, the next read from the never-opened connection would hang the program. I decided this was bad enough that I ought to add four characters of error checking. :)
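
Spelled out without the size constraints of a .signature, the loop and its error check might look like the sketch below. The function name copy_fd is made up, and the handling of short writes is an addition the original does not have.

```c
#include <unistd.h>

/* Hypothetical helper: copy everything readable from fd `in` to fd `out`,
   16 bytes at a time. Returns 0 at end of file, -1 on a read or write error. */
int copy_fd(int in, int out)
{
    char buf[16];
    ssize_t n;
    while ((n = read(in, buf, sizeof buf)) > 0) {  /* stops on 0 (EOF) or -1 (error) */
        ssize_t off = 0;
        while (off < n) {                          /* write may accept fewer bytes than asked */
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w <= 0)
                return -1;
            off += w;
        }
    }
    return n < 0 ? -1 : 0;
}
```

Calling copy_fd(s, 1) after the two writes would play the role of the final while loop.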

So it's a finger program.
