Tuesday, December 27, 2005

Tomcat must have noticed my blog

... because it ruined my servlet redesign by crashing the server when a servlet calls `java.lang.Class.forName()`.

*Update*: Apparently Tomcat is innocent and it is all my fault for causing an infinite recursion (by mapping
a single servlet to `/*` and trying to forward the request to `/WEB-INF/some-page.jsp`, which the `/*` mapping routes straight back to the same servlet).
I'm a bit disappointed that a misbehaving application can bring the whole application server down so easily.

Friday, December 23, 2005

Some Java programs are just LISP in disguise as well

The easiest way to write an expression parser that creates a syntax tree from a string is of course to use a suitable parser
generator. If that is not desirable for some reason, the second choice is a recursive-descent parser, a few
functions calling each other, walking over the string and keeping a pointer to the current position:

> Tree parse(String expr) {
>     return parseOr(expr, new int[] { 0 });
> }
>
> Tree parseOr(String expr, int[] pos) {
>     … parseAnd(expr, pos) … pos[0]++; …
> }


This works, but passing the parameters to each function is redundant and using `pos[0]` is ugly.

Wednesday, December 21, 2005

Why Java is not the right language for web development

The recent rise in popularity of [Ruby on Rails](http://www.rubyonrails.org/) seems to be mainly a backlash
against "the enterprise design" that is overengineered for most applications and designed a priori, "by committee"; people are drawn to Rails because the whole framework is so simple to learn and convenient,
not because they are persuaded Ruby is a better language for the application.

While I don't think Ruby is the *only* language suitable for Web programming, I feel quite confident Java
is not suitable for this purpose at all.

Wednesday, December 7, 2005

The ugliest C feature: `<tgmath.h>`

`<tgmath.h>` is a header provided by the standard C library,
introduced in C99 to allow easier porting of Fortran numerical software to C.

Fortran, unlike C, provides "intrinsic functions", which are a part of the language and behave more like
operators. While ordinary ("external") functions behave similarly to C functions with respect to types
(the types of arguments and parameters must match and the result type is fixed), intrinsic functions accept arguments of several types and their return type may depend on the type of their arguments.
For example, Fortran 77 provides, among others, an `INT` function which accepts `Integer`, `Real`, `Double` or `Complex` arguments and always returns an `Integer`,
and a `SIN` function which accepts `Real`, `Double` or `Complex`
arguments and returns a value of the same type.

This helps the programmer somewhat because the function calls don't have to be changed
if variable types change. On the other hand, user-defined
functions can't behave this way, so the additional flexibility is really limited to single subroutines that don't
need to call user-defined functions.

Some C programmers would call the feature ugly from the above description already, for the same
reason integrating `printf` into the language would be ugly.

This functionality was incorporated in C99 together with other features for better support of numerical
computation and it is provided in the above-mentioned `<tgmath.h>` header.
Provided are trigonometric and logarithmic functions, functions for rounding and a few others.
The header defines macros that shadow the existing functions from `<math.h>`; e.g. the `cos` macro behaves like the `cos` function when its parameter has type
`double`, like `cosf` for `float`, `cosl` for `long double`, `ccos` for `double _Complex`, `ccosf` for `float _Complex`, `ccosl` for `long double _Complex`. Finally, when the parameter has any integer type, the
`cos` function is called, as if the parameter were implicitly converted to `double`.
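The dispatch can be observed without calling anything, because `sizeof` does not evaluate its operand and only reports the type the compiler picked. The following is a small sketch; the struct and function names are made up for illustration, and it assumes a hosted C99 compiler with complex number support.

```c
#include <stddef.h>
#include <tgmath.h>   /* also provides <math.h> and <complex.h> */

/* Hypothetical helper type: size of the result of cos() per argument type. */
struct cos_sizes {
    size_t for_float;
    size_t for_double;
    size_t for_long_double;
    size_t for_int;
    size_t for_float_complex;
};

/* sizeof does not evaluate cos(...); it only asks which type the
 * type-generic macro selected for the given argument. */
struct cos_sizes cos_result_sizes(void)
{
    float f = 0.5f;
    double d = 0.5;
    long double ld = 0.5L;
    int i = 1;
    float _Complex fc = 0.5f;
    struct cos_sizes s;

    s.for_float = sizeof(cos(f));           /* behaves like cosf  */
    s.for_double = sizeof(cos(d));          /* behaves like cos   */
    s.for_long_double = sizeof(cos(ld));    /* behaves like cosl  */
    s.for_int = sizeof(cos(i));             /* integer: as if double */
    s.for_float_complex = sizeof(cos(fc));  /* behaves like ccosf */
    return s;
}
```

Each field ends up equal to the size of the corresponding type, e.g. `cos_result_sizes().for_int` equals `sizeof(double)`.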

The second reason why this feature is ugly is that it attempts to imitate functions, but the imitation is imperfect and even dangerous:
If you try to pass the generic macro `cos` as a function parameter, you actually always supply the `cos` function operating on `double`s because the macro expansion doesn't happen when `cos` is not followed
by a left parenthesis.
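A quick way to see this is to parenthesize the name, which suppresses the macro expansion exactly as passing the bare name does. This is only a sketch with made-up function names; it again uses `sizeof` so that nothing is actually called.

```c
#include <stddef.h>
#include <tgmath.h>

/* The name is followed by '(', so the type-generic macro is expanded
 * and the float variant is selected. */
size_t size_via_macro(void)
{
    float x = 0.5f;
    return sizeof(cos(x));
}

/* In "(cos)(x)" the name is followed by ')', so no macro expansion
 * happens and this is the ordinary function double cos(double).
 * The float argument is silently converted to double. */
size_t size_via_function(void)
{
    float x = 0.5f;
    return sizeof((cos)(x));
}
```

With the same `float` argument, `size_via_macro()` yields `sizeof(float)` while `size_via_function()` yields `sizeof(double)`. A callback that receives `cos` therefore computes in `double` no matter what the caller intended.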

The final reason why this feature is ugly is that such macros can't be implemented in strictly conforming C; they have to rely on some kind of compiler support. Experience (e.g. how slowly bugs in the
`glibc` implementation are discovered) seems to suggest this feature is used very rarely and doesn't deserve
to be a part of the "core language", especially because the underlying mechanism is not available to user code.
(Contrast this with `<stdarg.h>`, which is available for portable use.)

Now, if the feature is both ugly and not used in practice, why mention it at all? I'm writing this article because I have examined the `glibc` implementation and it is such an ingenious hack that I feel it should
be recorded for posterity, in some better way than this commit message:

> 2000-08-01  Ulrich Drepper  <drepper@redhat.com>
>             Joseph S. Myers <jsm28@cam.ac.uk>
>
>     * math/tgmath.h: Make standard compliant. Don't ask how.


Monday, December 5, 2005

How to destroy Linux

If you don't feel strongly about binary-only modules because "nobody gets hurt", Arjan's "Linux in a binary world... a doomsday scenario" might persuade you to change your mind.

Sunday, December 4, 2005

Some C programs are just LISP in disguise

After learning LISP I'm increasingly noticing open-coded emulations of LISP facilities. While every LISP advocate talks about macros, macros are not the only reason why LISP is worth learning.

Dynamic variables look like an obscure feature inherited from the times of LISP interpreters, but there is no really clean way to emulate the functionality if the language doesn't provide it. (On the other hand, the comprehensive LISP condition system can be implemented as a set of macros on top of
dynamic variables.)
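The usual open-coded emulation in C looks roughly like the sketch below: a global variable that is saved, rebound for the duration of a call and restored afterwards. All names here are hypothetical, and the restore step is skipped if the callee leaves via `longjmp()`, which is part of why the emulation is not really clean.

```c
/* The "dynamic variable": read by callees without being passed around. */
int indent = 0;

/* Any function deep in the call chain can read the current binding. */
int current_indent(void)
{
    return indent;
}

/* Roughly (let ((*indent* new-indent)) (funcall body)):
 * rebind, call, then restore the previous binding by hand. */
int call_with_indent(int new_indent, int (*body)(void))
{
    int saved = indent;
    int result;

    indent = new_indent;
    result = body();
    indent = saved;   /* not reached if body() exits via longjmp() */
    return result;
}

/* Example body that nests another binding two columns deeper. */
int two_columns_deeper(void)
{
    return call_with_indent(indent + 2, current_indent);
}
```

For instance, `call_with_indent(4, two_columns_deeper)` sees an indent of 6 inside the nested call, and `current_indent()` is back to 0 afterwards.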

Friday, December 2, 2005

Fun with C99 compound literals

Compound literals allow one to write e.g.:

> struct point { int x, y; };
> extern void putpixel(const struct point *);

and later

> putpixel(&(const struct point){ 1, 2 });
> struct point pt = (const struct point){ 37, 42 };

Nice, but not very useful, right? I was amazed at Nikita's usage:

* [Safe variadic functions](http://nikitadanilov.blogspot.com/2005/07/vaargs-c99-compound-literals-safe.html)
* [Named (and optional) formal parameters](http://nikitadanilov.blogspot.com/2005/08/named-formals.html)
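
To give a flavour of the second idea, here is my own minimal sketch of named and optional parameters built from a compound literal, designated initializers and a variadic macro. It is not Nikita's code; `range_args`, `range_count` and `range_count_impl` are names I made up for the example.

```c
/* Hypothetical parameter block; members left out by the caller are 0. */
struct range_args {
    int from;   /* optional, defaults to 0 */
    int to;     /* exclusive upper bound */
    int step;   /* optional, 0 (or negative) is treated as 1 */
};

/* Counts the values from, from + step, ... that are below to. */
int range_count_impl(const struct range_args *a)
{
    int step = a->step > 0 ? a->step : 1;

    if (a->to <= a->from)
        return 0;
    return (a->to - a->from + step - 1) / step;
}

/* The macro turns "named arguments" into designated initializers of a
 * compound literal. At least one argument is required, because an empty
 * initializer list is not valid C99. */
#define range_count(...) \
    range_count_impl(&(const struct range_args){ __VA_ARGS__ })
```

The call sites then read `range_count(.to = 10)` or `range_count(.step = 2, .from = 2, .to = 10)`, with the arguments in any order.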

strtol() Usage Guide

`strtol()` is what you get when you want to be flexible and yet have a simple interface.
At first glance it looks like a nice, clean function: you pass it a string and a base and you get the value and optionally a pointer to the rest of the string.

Then you read the documentation.

To convert `const char *str` to a `long`, properly checking for overflow, invalid trailing characters and empty input, it is necessary to do the following:

> char *p;
> long result;
>
> errno = 0;
> result = strtol(str, &p, base);
> if (errno != 0 || *p != '\0' || p == str)
>     error_handling();

It is necessary to check both `errno` and `p`. On overflow or underflow `strtol()` returns `LONG_MAX` or `LONG_MIN` and sets `errno` to `ERANGE`, so without the `errno` check those results are indistinguishable from valid input. The `*p != '\0'` test rejects trailing characters. On empty input the return value is `0` and `errno` might be set to `EINVAL`, but that is not guaranteed; the portable way of checking for empty input is comparing `p` and `str`.

This will still accept strings that start with white space; check for `!isspace((unsigned char)*str)` if you want to reject them.
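
Put together, the checks above fit into a small wrapper. This is just a sketch; `parse_long` and its return convention (0 on success, -1 on any error) are my own choices, not a standard interface.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: converts str to a long in the given base.
 * Returns 0 and stores the value in *out on success, -1 on failure. */
int parse_long(const char *str, int base, long *out)
{
    char *end;
    long value;

    /* strtol() itself would skip leading white space; reject it here. */
    if (isspace((unsigned char)*str))
        return -1;

    errno = 0;
    value = strtol(str, &end, base);

    if (errno != 0)       /* ERANGE on overflow/underflow, maybe EINVAL */
        return -1;
    if (end == str)       /* nothing was converted, e.g. empty input */
        return -1;
    if (*end != '\0')     /* trailing characters after the number */
        return -1;

    *out = value;
    return 0;
}
```

With this, `parse_long("42", 10, &v)` succeeds, while `""`, `"12abc"`, `" 42"` and a number too large for `long` are all rejected.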