/* Test an implicit equation I want to use for modeling 2-D blobs that surround a set of points to produce a smoothly changing shape whose area is supposed to be determined, more or less, by the sum of the weights on the points. (This turns out to be standard metaballs, in 2-D.) This could be very useful for visualizing scatterplots with widely varying densities. Specifically, the implicit function is -1 + \sum_{i=1}^n \frac{r_i^2}{(x_i - x)^2 (y_i - y)^2} which is >= 0 inside the blob and < 0 outside the blob. With two equal discs, error is zero when they're right on top of each other or sufficiently far apart. It reaches a maximum when the centers are separated by about 2.6r, at which point it's +28%. Multiple discs exactly on top of each other gives a nice low error, while I've constructed a grid of multiple identical discs 2.6r apart which gives an error over 120%. Over 120% error sounds bad, but this is actually a huge improvement over the approach I've been using so far, which sometimes has much larger errors and also wasn't very smooth at all. My original application is simulating tree growth, but it occurs to me that this is also useful for making scatterplots of points. The problem with scatterplots is dynamic density range. As long as you have many more pixels than points, and you plot each point in a single pixel, you're fine, although if it's an isolated single-pixel data point in a large background, it may still be hard to see, and so you may be tempted to make the marker bigger. You can improve the situation somewhat by rendering data points in white and the background in black, which affords you perhaps another two orders of magnitude over the traditional color scheme, and by rendering data points in grayscale, so that a pixel containing two data points looks different from a pixel containing one; this gives you an additional order of magnitude, more or less. But that still only gives you about two orders of magnitude on a computer monitor (remember the limitations of the gamma curve), and in many cases you'd like to have four or more. For example, population densities in the US range from 0.9/km² in Niobrara County, Wyoming, to 27000/km² in Manhattan. You could quite reasonably make a 10-megapixel scatterplot of US population density, one pixel per square kilometer, and it would be nice to be able to see the difference between the 8.4 million people in the 786 km² of New York City and the 636000 in the 232 km² of Boston, without losing the ability to tell where the population is in rural Wyoming. But if the 59 pixels of Manhattan at 27000 are at 100% brightness, then anything less than 270/km² will be under 1% brightness, and so probably indistinguishable from black. If, instead, 3 people per km² is 1% brightness, you can see where everybody in Wyoming lives, but as soon as you get over 300 people per km² (Hartford, South Dakota: 2534 people in 5.9 km², 431/km²) you run into a dynamic range ceiling that could make it impossible to distinguish between the density of Hartford, SD, and Manhattan, which is almost a hundred times denser. If, however, you let the 100% brightness areas spread to cover more of the map, like the tumors they are, then you can make the total brightness associated with a city accurately reflect its total population, at the cost of some spatial imprecision. At a maximum of 300 people per pixel, Manhattan would swell to fill 5420 pixels, some 41 pixels in radius, instead of the 88 pixels that it would cover physically. New York City as a whole would become a blob of 28000 pixels of pure white, extending some 94 kilometers in every direction, from Princeton to Bridgeport. But how do you spread the brightness out? Years ago, with my "char-pair scatterplots" post to kragen-hacks, I did it with a random walk: if a pixel that was to be set [to black; I was ignorant] was already set, the plotter would wander around to adjacent pixels until it found one that wasn't already full. Instead, you can do it by constructing an implicit function that defines a shape as a level set of a function composed of a sum of basis functions. This code on my netbook can only test about 12 million disc-pixel intersections per second, which is not even close to enough, but you could do much better with a smarter evaluation strategy ­ using interval arithmetic, spline approximations, and so on. */ #include #include #include static inline void compute_areas(int nn, double xx[], double yy[], double rr[]) { // Compute a fucking bounding box, and also a minimum radius. double xmin = xx[0] - 2*nn*rr[0], xmax = xx[0] + 2*nn*rr[0], ymin = yy[0] - 2*nn*rr[0], ymax = yy[0] + 2*nn*rr[0], rmin = rr[0]; // printf("disc at (%.2lg, %.2lg) radius %.2lg\n", xx[0], yy[0], rr[0]); for (int ii = 1; ii < nn; ii++) { // printf("disc at (%.2lg, %.2lg) radius %.2lg\n", xx[ii], yy[ii], rr[ii]); double nxmin = xx[ii] - 2*nn*rr[ii], nxmax = xx[ii] + 2*nn*rr[ii], nymin = yy[ii] - 2*nn*rr[ii], nymax = yy[ii] + 2*nn*rr[ii]; if (nxmin < xmin) xmin = nxmin; if (nxmax > xmax) xmax = nxmax; if (nymin < ymin) ymin = nymin; if (nymax > ymax) ymax = nymax; if (rr[ii] < rmin) rmin = rr[ii]; } // Now we choose a sampling grid to use to sample the implicit blob. // I choose a spacing of one tenth of the minimum radius, figuring // that this should be sufficient to reasonably approximate the // shapes of the shapes. double ds = rmin/10; int ww = (int)ceil((xmax - xmin)/ds), hh = (int)ceil((ymax - ymin)/ds); printf("Sampling %d×%d pixels of size %.2lg ", ww, hh, ds); printf("over x ∈ [%.2lg, %.2lg], y ∈ [%.2lg, %.2lg] ", xmin, xmax, ymin, ymax); printf("because smallest radius is %.2lg...\n", rmin); // Now we loop over the pixels and evaluate the implicit equation to // find out which ones are inside the blob, and count them. int pixels_inside = 0; for (int ii = 0; ii < ww; ii++) { for (int jj = 0; jj < hh; jj++) { double implicit_function_sum = 0; for (int kk = 0; kk < nn; kk++) { double dx = xx[kk] - (ii*ds + xmin), dy = yy[kk] - (jj*ds + ymin), rsq = dx*dx + dy*dy; // If we are exactly hitting the center of a disc, we want to // add a number sufficient to make this point "inside". implicit_function_sum += (rsq == 0.0) ? 2 : rr[kk]*rr[kk] / rsq; } if (implicit_function_sum >= 1.0) pixels_inside++; } } // Each pixel is ds × ds. double total_area = pixels_inside * ds * ds; double theoretical_area = 0; for (int ii = 0; ii < nn; ii++) theoretical_area += M_PI * rr[ii] * rr[ii]; printf("Total sampled area of implicit blob was %f\n", total_area); printf("Theoretical correct value would be %f\n", theoretical_area); printf("Error is %.2f%%\n", 100 * (total_area - theoretical_area) / theoretical_area); } static inline int usage(char *argv0, char *reason) { if (reason) fprintf(stderr, "Failure on '%s'\n", reason); fprintf(stderr, "Usage: %s x0 y0 r0 [x1 y1 r1 ...]\n", argv0); return 1; } int main(int argc, char **argv) { int ndiscs = (argc-1)/3; if (ndiscs <= 0 || ndiscs*3 + 1 != argc) { printf("%d %d\n", ndiscs, argc); return usage(argv[0], NULL); } double xx[ndiscs], yy[ndiscs], rr[ndiscs]; char *endptr; for (int ii = 1, jj = 0; ii < argc; ii += 3, jj++) { xx[jj] = strtod(argv[ii], &endptr); if (*endptr) return usage(argv[0], argv[ii]); yy[jj] = strtod(argv[ii+1], &endptr); if (*endptr) return usage(argv[0], argv[ii+1]); rr[jj] = strtod(argv[ii+2], &endptr); if (*endptr) return usage(argv[0], argv[ii+2]); } compute_areas(ndiscs, xx, yy, rr); return 0; }