/* Test an implicit equation I want to use for modeling 2-D blobs that
surround a set of points to produce a smoothly changing shape whose
area is supposed to be determined, more or less, by the sum of the
weights on the points.
(This turns out to be standard metaballs, in 2-D.)
This could be very useful for visualizing scatterplots with widely
varying densities.
Specifically, the implicit function is
-1 + \sum_{i=1}^n \frac{r_i^2}{(x_i - x)^2 (y_i - y)^2}
which is >= 0 inside the blob and < 0 outside the blob.
With two equal discs, error is zero when they're right on top of
each other or sufficiently far apart. It reaches a maximum when
the centers are separated by about 2.6r, at which point it's +28%.
Multiple discs exactly on top of each other gives a nice low error,
while I've constructed a grid of multiple identical discs 2.6r apart
which gives an error over 120%.
Over 120% error sounds bad, but this is actually a huge improvement
over the approach I've been using so far, which sometimes has much
larger errors and also wasn't very smooth at all. My original
application is simulating tree growth, but it occurs to me that this
is also useful for making scatterplots of points.
The problem with scatterplots is dynamic density range.
As long as you have many more pixels than points, and you plot each
point in a single pixel, you're fine, although if it's an isolated
single-pixel data point in a large background, it may still be hard to
see, and so you may be tempted to make the marker bigger. You can
improve the situation somewhat by rendering data points in white and
the background in black, which affords you perhaps another two orders
of magnitude over the traditional color scheme, and by rendering data
points in grayscale, so that a pixel containing two data points looks
different from a pixel containing one; this gives you an additional
order of magnitude, more or less.
But that still only gives you about two orders of magnitude on a
computer monitor (remember the limitations of the gamma curve), and in
many cases you'd like to have four or more.
For example, population densities in the US range from 0.9/km² in
Niobrara County, Wyoming, to 27000/km² in Manhattan. You could quite
reasonably make a 10-megapixel scatterplot of US population density,
one pixel per square kilometer, and it would be nice to be able to see
the difference between the 8.4 million people in the 786 km² of New
York City and the 636000 in the 232 km² of Boston, without losing the
ability to tell where the population is in rural Wyoming. But if the
59 pixels of Manhattan at 27000 are at 100% brightness, then anything
less than 270/km² will be under 1% brightness, and so probably
indistinguishable from black.
If, instead, 3 people per km² is 1% brightness, you can see where
everybody in Wyoming lives, but as soon as you get over 300 people per
km² (Hartford, South Dakota: 2534 people in 5.9 km², 431/km²) you run
into a dynamic range ceiling that could make it impossible to
distinguish between the density of Hartford, SD, and Manhattan, which
is almost a hundred times denser.
If, however, you let the 100% brightness areas spread to cover more of
the map, like the tumors they are, then you can make the total
brightness associated with a city accurately reflect its total
population, at the cost of some spatial imprecision. At a maximum of
300 people per pixel, Manhattan would swell to fill 5420 pixels,
some 41 pixels in radius, instead of the 88 pixels that it would cover
physically. New York City as a whole would become a blob of 28000
pixels of pure white, extending some 94 kilometers in every
direction, from Princeton to Bridgeport.
But how do you spread the brightness out? Years ago, with my
"char-pair scatterplots" post to kragen-hacks, I did it with a random
walk: if a pixel that was to be set [to black; I was ignorant] was
already set, the plotter would wander around to adjacent pixels until
it found one that wasn't already full. Instead, you can do it by
constructing an implicit function that defines a shape as a level set
of a function composed of a sum of basis functions.
This code on my netbook can only test about 12 million disc-pixel
intersections per second, which is not even close to enough, but you
could do much better with a smarter evaluation strategy using
interval arithmetic, spline approximations, and so on.
*/
#include
#include
#include
static inline void
compute_areas(int nn, double xx[], double yy[], double rr[])
{
// Compute a fucking bounding box, and also a minimum radius.
double xmin = xx[0] - 2*nn*rr[0], xmax = xx[0] + 2*nn*rr[0],
ymin = yy[0] - 2*nn*rr[0], ymax = yy[0] + 2*nn*rr[0],
rmin = rr[0];
// printf("disc at (%.2lg, %.2lg) radius %.2lg\n", xx[0], yy[0], rr[0]);
for (int ii = 1; ii < nn; ii++) {
// printf("disc at (%.2lg, %.2lg) radius %.2lg\n", xx[ii], yy[ii], rr[ii]);
double nxmin = xx[ii] - 2*nn*rr[ii], nxmax = xx[ii] + 2*nn*rr[ii],
nymin = yy[ii] - 2*nn*rr[ii], nymax = yy[ii] + 2*nn*rr[ii];
if (nxmin < xmin) xmin = nxmin;
if (nxmax > xmax) xmax = nxmax;
if (nymin < ymin) ymin = nymin;
if (nymax > ymax) ymax = nymax;
if (rr[ii] < rmin) rmin = rr[ii];
}
// Now we choose a sampling grid to use to sample the implicit blob.
// I choose a spacing of one tenth of the minimum radius, figuring
// that this should be sufficient to reasonably approximate the
// shapes of the shapes.
double ds = rmin/10;
int ww = (int)ceil((xmax - xmin)/ds),
hh = (int)ceil((ymax - ymin)/ds);
printf("Sampling %d×%d pixels of size %.2lg ", ww, hh, ds);
printf("over x ∈ [%.2lg, %.2lg], y ∈ [%.2lg, %.2lg] ", xmin, xmax, ymin, ymax);
printf("because smallest radius is %.2lg...\n", rmin);
// Now we loop over the pixels and evaluate the implicit equation to
// find out which ones are inside the blob, and count them.
int pixels_inside = 0;
for (int ii = 0; ii < ww; ii++) {
for (int jj = 0; jj < hh; jj++) {
double implicit_function_sum = 0;
for (int kk = 0; kk < nn; kk++) {
double
dx = xx[kk] - (ii*ds + xmin),
dy = yy[kk] - (jj*ds + ymin),
rsq = dx*dx + dy*dy;
// If we are exactly hitting the center of a disc, we want to
// add a number sufficient to make this point "inside".
implicit_function_sum += (rsq == 0.0) ? 2 : rr[kk]*rr[kk] / rsq;
}
if (implicit_function_sum >= 1.0) pixels_inside++;
}
}
// Each pixel is ds × ds.
double total_area = pixels_inside * ds * ds;
double theoretical_area = 0;
for (int ii = 0; ii < nn; ii++) theoretical_area += M_PI * rr[ii] * rr[ii];
printf("Total sampled area of implicit blob was %f\n", total_area);
printf("Theoretical correct value would be %f\n", theoretical_area);
printf("Error is %.2f%%\n",
100 * (total_area - theoretical_area) / theoretical_area);
}
static inline int
usage(char *argv0, char *reason)
{
if (reason) fprintf(stderr, "Failure on '%s'\n", reason);
fprintf(stderr, "Usage: %s x0 y0 r0 [x1 y1 r1 ...]\n", argv0);
return 1;
}
int
main(int argc, char **argv)
{
int ndiscs = (argc-1)/3;
if (ndiscs <= 0 || ndiscs*3 + 1 != argc) {
printf("%d %d\n", ndiscs, argc);
return usage(argv[0], NULL);
}
double xx[ndiscs], yy[ndiscs], rr[ndiscs];
char *endptr;
for (int ii = 1, jj = 0; ii < argc; ii += 3, jj++) {
xx[jj] = strtod(argv[ii], &endptr);
if (*endptr) return usage(argv[0], argv[ii]);
yy[jj] = strtod(argv[ii+1], &endptr);
if (*endptr) return usage(argv[0], argv[ii+1]);
rr[jj] = strtod(argv[ii+2], &endptr);
if (*endptr) return usage(argv[0], argv[ii+2]);
}
compute_areas(ndiscs, xx, yy, rr);
return 0;
}