Failing to upgrade to CryWrap

So, in March, we upgraded our machine from Sarge to Etch. We had been using sslwrap in Sarge, but sslwrap doesn't exist in Etch. According to Jonathan McDowell, the guy who used to maintain the Debian sslwrap package:

sslwrap (2.0.6-18) unstable; urgency=low

 * Users might like to consider switching away from sslwrap to
   crywrap or investigating whether more recent versions of the
   services they're sslwrapping are themselves now ssl enabled. It
   is envisaged that at some point in the future I will request
   removal of sslwrap from the archive, though I hope to
   investigate the possibility of a smooth upgrade path to crywrap
   before that happens. sslwrap is effectively dead upstream and I
   think it's probably better to consider the existing
   alternatives that can perform the same function than continue
   to work on sslwrap long term.

 -- Jonathan McDowell <noodles@earth.li> Sat, 13 Aug 2005 13:01:06
    +0100

(from http://ubuntu2.cica.es/ubuntu/ubuntu/pool/universe/s/sslwrap/sslwrap_2.0.6-18.diff.gz)

See also http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=374521, where the maintainer requested its removal from Debian.

Why I Tried, Then Gave up on CryWrap

apt-cache search sslwrap found only crywrap, and apt-cache show crywrap said:

CryWrap is intended to be a drop-in replacement for sslwrap.

This is more or less a blatant lie. CryWrap's command-line options have nothing in common with sslwrap's, and sslwrap is written to run from inetd --- for example, it reports its errors through syslogd, not by printing them to standard error. CryWrap is not, even though it has an --inetd option.

Here are the problems I encountered trying to use CryWrap:

  1. CryWrap doesn't support sslwrap command-line options.
  2. CryWrap reports its errors to stderr.
  3. When I did get CryWrap to work, it reliably took about 110 seconds to negotiate an SSL connection, which is longer than Thunderbird is willing to wait.
  4. CryWrap's sslwrap wrapper doesn't support inetd.
  5. CryWrap's documented -v flag doesn't work as documented.

I wasted an hour trying to get CryWrap to work, and eventually gave up and installed stunnel4 instead. Here are some details, in case you find yourself in a similar situation:

CryWrap doesn't support sslwrap command-line options.

The line in our /etc/inetd.conf for sslwrap looked like this, all on one line:

pop3s stream  tcp nowait root /usr/sbin/tcpd 
    /usr/sbin/sslwrap -cert /etc/sslwrap/server.pem 
                      -addr 127.0.0.1 
                      -port 110

I apt-get installed crywrap, changed sslwrap to crywrap, and hoped for the best. Initially, that failed because tcpd was specially configured to allow connections to sslwrap from weird places, in /etc/hosts.deny:

ALL EXCEPT sslwrap: PARANOID EXCEPT <censored>

Our Argentine ISP, TeleCentro, is so incompetent that our reverse DNS maps our IP address to a name that doesn't exist. So tcp_wrappers's PARANOID rule won't allow us to connect to tcp-wrapped services. So I changed that to say

ALL EXCEPT crywrap: PARANOID EXCEPT <censored>

Then I ran into the problem that we had removed the /etc/sslwrap directory and the server.pem file inside it that contained the server's private key. After a little bit of digging, I found out where to get the private key file, stuck it in /etc/crywrap/server.pem, and put the following completely wrong line in /etc/inetd.conf:

pop3s stream  tcp nowait root /usr/sbin/tcpd 
    /usr/sbin/crywrap -cert /etc/crywrap/server.pem 
                      -addr 127.0.0.1 
                      -port 110

You see, at this point, I still believed the package description that claimed that CryWrap was "a drop-in replacement". I got a log line (apparently from tcpd) that said the connection had been made:

Mar 29 19:51:03 panacea crywrap[27458]: 
    connect from 190.55.55.32 (190.55.55.32)

But the crywrap process had died, and there were no error messages in any of the /var/log files explaining why. This is because of problem #2, "CryWrap reports its errors to stderr," which I explain below. Upon consulting the man page and debugging from the error message for a while, I ended up with the following line in /etc/inetd.conf instead:

pop3s stream tcp nowait root /usr/sbin/tcpd
    /home/kragen/crywrap -d 127.0.0.1/110 -i

/etc/crywrap/server.pem is the default location for CryWrap to look for a server certificate, so I omitted it from the command line.

CryWrap reports its errors to stderr.

In order to find out what was wrong, I temporarily ran /home/kragen/crywrap instead of /usr/sbin/crywrap from inetd. /home/kragen/crywrap is this script:

#!/bin/sh
/usr/bin/strace -s4096 -o /tmp/crywrap.strace /usr/sbin/crywrap "$@"

And it turned out that CryWrap was writing its error messages to stderr, file descriptor 2, instead of to a log file. In a process run from inetd, stderr is actually connected to the socket talking to the client, so writing error messages to it is almost certain to violate the protocol expected by the client. Here's one sample error message (a result of problem #4, "CryWrap's sslwrap wrapper doesn't support inetd", below) from strace's output (wrapped for readability):

write(2, "crywrap", 7)                  = 7
write(2, ":", 1)                        = 1
write(2, " ", 1)                        = 1
write(2, "Could not resolve address: `/\'", 30) = 30
write(2, "\n", 1)                       = 1
write(2, "Try `crywrap --help\' or `crywrap --usage\' for 
    more information.\n", 64) = 64
exit_group(64)                          = ?

This was at the very end of the file.

Now, this would not be such a heinous sin in a program that was intended to speak, say, SMTP. If a fatal error message gets sent to an SMTP client, it's likely to end up somewhere that a human being can see it and diagnose the problem. But SSL is a different matter. SSL connections are normally full of random toxic binary data, so almost no SSL-speaking programs will dump out that data on a human when there's a connection failure. So the only way I was able to find these error messages was by running the program under strace(1).
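
Since stderr under inetd is the client socket, a wrapper script can point stderr somewhere else before it runs the real program, which avoids the need for strace just to read an error message. Here is a minimal sketch of that idea. The function name run_with_stderr_log and the log path in the usage note are made up for illustration; nothing here comes from CryWrap itself.

```shell
#!/bin/sh
# run_with_stderr_log LOGFILE CMD [ARGS...]
# Run CMD with its stderr appended to LOGFILE. stdin and stdout are left
# untouched, so under inetd they still talk to the client socket.
# (Hypothetical helper name, not part of any package.)
run_with_stderr_log() {
    logfile=$1
    shift
    "$@" 2>>"$logfile"
}
```

In a wrapper like /home/kragen/crywrap, the last line would then be something like run_with_stderr_log /tmp/crywrap.stderr /usr/sbin/crywrap "$@", and the error messages would pile up in that file instead of being sent to the mail client.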

CryWrap took about 110 seconds to negotiate an SSL connection.

Once I got CryWrap to run, my wife Beatrice was still reporting failures getting her mail in Thunderbird. strace showed that CryWrap was running and receiving data (less /tmp/crywrap.strace and then typing >F was very helpful to watch this in real time), but it was receiving it very slowly, a few bytes every few seconds.

At Paul Visscher's suggestion, I tested the connection myself with the OpenSSL package's openssl command:

openssl s_client -connect panacea.canonical.org:pop3s

This did eventually connect and allow me to speak POP (simulated copy-and-paste here may contain errors):

...
    Timeout   : 300 (sec)
    Verify return code: 21 (unable to verify the first certificate)
---
+OK
USER imaptest
+OK
PASS <censored>
+OK
QUIT
DONE

However, it took about a minute and 51 seconds. This is apparently more than Thunderbird's timeout. I don't know enough about SSL to know why this might be. CryWrap reported it with these syslog messages (wrapped and trimmed for readability):

crywrap[27830]: Accepted connection from 190.55.55.32 on 0 to 
    127.0.0.1/110
crywrap[27830]: Handshake failed: A TLS packet with unexpected 
    length was received.

I never did figure out why this happened, and so I gave up on CryWrap and switched to stunnel4 (see below).
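
To put a number on a slow handshake like this one instead of watching the clock, a small timing helper is enough. This is only a sketch: the name elapsed_seconds is invented, and it assumes a date command that understands +%s, which is common but not guaranteed everywhere.

```shell
#!/bin/sh
# elapsed_seconds CMD [ARGS...]
# Run CMD with stdin from /dev/null and its output discarded, then print
# how many whole seconds it took. CMD's exit status is ignored.
# (Hypothetical helper; assumes `date +%s` is supported.)
elapsed_seconds() {
    start=$(date +%s)
    "$@" </dev/null >/dev/null 2>&1 || :
    end=$(date +%s)
    echo $((end - start))
}
```

For example, elapsed_seconds openssl s_client -connect panacea.canonical.org:pop3s should print roughly how long the connection and handshake took, since with stdin at end-of-file s_client normally exits once the connection is up.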

CryWrap's sslwrap wrapper doesn't support inetd.

There is a shell script in /usr/share/crywrap/sslwrap that is intended to make crywrap act like sslwrap, but it doesn't consider the case of running from inetd (the -i or --inetd flag). Because it's a badly-written shell script, it doesn't notice that its "listen port" parameter is missing; it merely tries to invoke CryWrap with "-l /". (CryWrap uses a slash to separate IP address from port, instead of the traditional colon; in this case, both the IP address and the port are missing, leaving only the lonesome "/", like a girl who's been stood up on a date.)

CryWrap reports this by sending the helpful message:

crywrap: Could not resolve address: `/'

to the would-be SSL client. I extracted it from an strace output file in /tmp, except that I had to use strace -ff to follow the children of the /usr/share/crywrap/sslwrap script. (I guess I could have just redirected stderr to a file instead of using strace.)
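
A wrapper script can catch the missing listen port itself rather than handing "/" to the program it wraps. The following is a sketch of that check, not the actual contents of /usr/share/crywrap/sslwrap; the function name listen_arg and the wording of the error message are my own.

```shell
#!/bin/sh
# listen_arg ADDR PORT
# Print "ADDR/PORT", the slash-separated form CryWrap takes for addresses.
# An empty ADDR is allowed (as in "/3802"), but an empty PORT is an error:
# complain on stderr and return nonzero instead of emitting a bare "/".
# (Hypothetical helper, for illustration only.)
listen_arg() {
    addr=${1-}
    port=${2-}
    if [ -z "$port" ]; then
        echo "listen port required" >&2
        return 1
    fi
    echo "$addr/$port"
}
```

A caller would do something like arg=$(listen_arg "$addr" "$port") || exit 64 before building the real command line, so that the failure is reported where it actually happened.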

CryWrap's documented -v flag doesn't work as documented.

CryWrap's man page documents a -v flag. -v 0 is documented to turn off client certificate validation, although having it turned off is documented to be the default. We thought that perhaps the default was actually something other than what it was documented to be, because on the successful openssl s_client connections (see above under #3), we were getting this message:

crywrap[28190]: Error getting certificate from client: The peer
    did not send any certificate.

And it seemed plausible that this might explain the slowness (#3). So I tried adding -v 0 to the command line, because the man page says:

--verify (-v) [LEVEL]

Set the level of client certificate verification. Level one simply logs the result, level two and above abort if the certificate could not be verified.
Default is 0.

If you actually try running crywrap with -v 0, you get this error message:

kragen@panacea:~$ /usr/sbin/crywrap -l /3802 -d /110 -v 0
crywrap: Too many arguments
Try `crywrap --help' or `crywrap --usage' for more information.

Except that I didn't originally get the error message at the command line; I had to dig it out of strace output in /tmp after editing /etc/inetd.conf and restarting inetd. It turns out that -v0 is the supported syntax, despite what the man page says, and in violation of the usual Unix conventions. No space is permitted.
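
For comparison, here is how the usual convention looks with the shell's own getopts, which accepts both -v 1 and -v1 when -v takes a required argument. This is a sketch with an invented function name, parse_verify. It says nothing certain about how CryWrap parses its options; my guess is that the [LEVEL] in the man page means an optional argument, which GNU-style parsers only accept when it is attached to the flag, but I haven't checked.

```shell
#!/bin/sh
# parse_verify [ARGS...]
# Print the level given with -v, or 0 if -v is absent.
# With getopts and "v:", both "-v 1" and "-v1" are accepted.
# (Hypothetical helper, for illustration only.)
parse_verify() {
    level=0
    OPTIND=1    # reset so the function can be called more than once
    while getopts "v:" opt "$@"; do
        case "$opt" in
            v) level=$OPTARG ;;
            *) return 2 ;;    # getopts has already printed a complaint
        esac
    done
    echo "$level"
}
```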

Success with stunnel4

I did this:

$ sudo apt-get install stunnel4

Then, after skimming the stunnel man page, I stuck this in /etc/inetd.conf (all on one line) in place of the crywrap line:

pop3s stream tcp nowait root /usr/sbin/tcpd 
    /usr/bin/stunnel -p /etc/crywrap/server.pem -r 110

That worked. Then I moved /etc/crywrap/server.pem to /etc/stunnel/server.pem and all was good. The total elapsed time since giving up on CryWrap was just under eleven minutes.

Things I Learned

Or was reminded of.

  1. It's easy to underestimate how much of a pain in the ass your software will be for other people. Presumably CryWrap's author wouldn't have had any of the above problems (except for #3, and he could have probably diagnosed that one).

  2. If I write software and claim it's a "drop-in replacement" for something else, someone is going to be sad. Or pissed off. Because I'll probably forget something. (Although hopefully I'll do better than this!)

  3. It's good to be careful about where error messages go.

  4. I should try to make sure that my software handles errors (e.g. missing listen port) in a graceful fashion, i.e. by bombing out with an error ("listen port required") instead of proceeding to invoke something else with some broken default (in this case, the empty string) and relying on it to emit a useful error message (crywrap: Could not resolve address: `/'). Generally it's pretty easy to make this mistake in shell scripts, but in this case the listen port was explicitly set to the empty string before command-line parsing, as a default, so the problem would have been the same regardless of language.

  5. It takes as long to write stuff like this up as it does to experience it.

  6. Violating established conventions is likely to cause some frustration; be sure you're doing it for a good reason. By convention -v 0 is equivalent to -v0 when -v takes an argument; the violation of this convention made the software harder to use.

  7. stunnel rocks and can do what sslwrap did. CryWrap sucks and can't.

  8. OpenSSL has the openssl s_client command, which is like an SSL version of netcat, and also openssl s_server. These should be very handy for troubleshooting SSL stuff in general.

  9. I'm not a great sysadmin, and I tend to be too persistent when I should give up and try something else a little sooner.

Credits

Thanks to Gergely Nagy for writing CryWrap, Jonathan McDowell for maintaining the sslwrap Debian package for so long, Rick Kaseguma for writing sslwrap in the first place, Beatrice Murch for having the patience to help me test the mail server after the upgrade, Paul Visscher for helping me out with most of the above stuff and also doing a bunch of the work of the Etch upgrade on our machine, and Brett Smith and Jason Cook for doing most of the rest of that work.