#!/usr/bin/python3 """Quares: quadtree for rectangle coverage. The objective here is to efficiently compute the area of a union of many axis-aligned rectangles with integer coordinates, or maybe a difference of multiple such regions. Qanvas ====== There’s an imperative convenience interface, Qanvas:: >>> q = quares.Qanvas(16) >>> q.fill(x=3, y=4, width=7, height=2, color=-2) >>> q.print() //////////////// //////////////// //////////////// //////////////// ///.......////// ///.......////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// >>> q.coverage(-1) 242 >>> q.coverage(-2) 14 Subsequent rectangles drawn with .fill can replace previously drawn pixels:: >>> q.fill(5, 2, width=3, height=10, color=-3) >>> q.print() //////////////// //////////////// /////---//////// /////---//////// ///..---..////// ///..---..////// /////---//////// /////---//////// /////---//////// /////---//////// /////---//////// /////---//////// //////////////// //////////////// //////////////// //////////////// >>> q.coverage(-2) 8 >>> q.coverage(-3) 30 Coverage of -2, drawn as '.', has decreased from 7×2 = 14 to 8. There’s a fill_masked method to replace only a given target color; in this case, we choose as our target the default background color, -1: >>> q.fill_masked(x=4, y=3, width=6, height=6, color=-5, target=-1) >>> q.print() //////////////// //////////////// /////---//////// ////+---++////// ///..---..////// ///..---..////// ////+---++////// ////+---++////// ////+---++////// /////---//////// /////---//////// /////---//////// //////////////// //////////////// //////////////// //////////////// >>> [q.coverage(color) for color in [-1, -2, -3, -4, -5]] [206, 8, 30, 0, 12] You can snapshot the Qanvas state and restore the snapshot later. This can provide undo and redo:: >>> snapshot = q.snapshot() >>> q.fill(-5, -5, 10, 10, -5) >>> q.print() +++++/////////// +++++/////////// +++++---//////// +++++---++////// +++++---..////// ///..---..////// ////+---++////// ////+---++////// ////+---++////// /////---//////// /////---//////// /////---//////// //////////////// //////////////// //////////////// //////////////// >>> q.restore(snapshot) >>> q.print() //////////////// //////////////// /////---//////// ////+---++////// ///..---..////// ///..---..////// ////+---++////// ////+---++////// ////+---++////// /////---//////// /////---//////// /////---//////// //////////////// //////////////// //////////////// //////////////// Not yet implemented: to get the pixel data for other than debugging purposes, there's a .tilemaq() method which takes a power-of-2 tile-size argument and returns a Tilemaq object. This has two methods: .map() to return a two-dimensional array of tile indices, and .tile(index) to return a two-dimensional array of pixels for the given tile index. The pixels for a given tile index are guaranteed to never change; new pixel patterns will be assigned new tile indices. The idea is that if you have, say, a 1024×1024 qanvas that you’re incrementally updating, you can divide it into, say, 32×32 tiles, 1024 of them in all. You can use a slow scripting language like GDScript to sling 1024 tiles up onto the display with no problem at all, as long as you can call some fast native-code thing to actually put the tile there. And populating the 1024 pixels of a previously unknown tile is also something you can reasonably do from GDScript. Performance =========== I tried building a pathologically bad case, a 2048×2048 canvas with 2048 rectangles on it forming a crisscross pattern:: In [1]: import quares In [4]: q = quares.Qanvas(2048) In [5]: %%time ...: for i in range(0, 2048, 2): ...: q.fill(i, 0, 1, 2048, -2) ...: q.fill(0, i, 2048, 1, -3) ...: CPU times: user 10.1 s, sys: 3.36 ms, total: 10.1 s Wall time: 10.1 s This took an average of 5 milliseconds to add each of these rectangles. (But see below; this is from an old version of the code.) The leftover parts of the background are in one million isolated pixels, which we can visit in a quarter millisecond:: In [6]: %%time ...: q.coverage(-1) ...: ...: CPU times: user 249 µs, sys: 2 µs, total: 251 µs Wall time: 268 µs Out[6]: 1048576 The remainder are split almost evenly between the horizontal and vertical rectangle colors:: In [7]: %%time ...: q.coverage(-2) ...: ...: CPU times: user 342 µs, sys: 0 ns, total: 342 µs Wall time: 356 µs Out[7]: 1572352 In [8]: %%time ...: q.coverage(-3) ...: ...: CPU times: user 160 µs, sys: 0 ns, total: 160 µs Wall time: 171 µs Out[8]: 1573376 Taking a snapshot is instant. We can also instantly fill the whole grid with a new color:: In [9]: s = q.snapshot() In [10]: %%time ...: q.fill(0, 0, 2048, 2048, -4) ...: ...: CPU times: user 14 µs, sys: 0 ns, total: 14 µs Wall time: 21.9 µs In [11]: %%time ...: q.coverage(-4) ...: ...: CPU times: user 15 µs, sys: 0 ns, total: 15 µs Wall time: 22.4 µs Out[11]: 4194304 Restoring the snapshot is also instant:: In [12]: %%time ...: q.restore(s) ...: ...: CPU times: user 8 µs, sys: 0 ns, total: 8 µs Wall time: 13.8 µs In [13]: %%time ...: q.coverage(-4) ...: ...: CPU times: user 214 µs, sys: 2 µs, total: 216 µs Wall time: 223 µs Out[13]: 0 In [14]: %%time ...: q.coverage(-2) ...: ...: CPU times: user 14 µs, sys: 0 ns, total: 14 µs Wall time: 19.6 µs Out[14]: 1572352 If instead of fill we use fill_masked, to create our million background tiles, it takes 4590 milliseconds instead of 0.02 milliseconds:: In [15]: %%time ...: q.fill_masked(0, 0, 2048, 2048, -4, target=-1) ...: ...: CPU times: user 4.59 s, sys: 0 ns, total: 4.59 s Wall time: 4.59 s On one hand, that’s only 4.6 microseconds per pixel set. On the other hand, it’s too slow to do it within a single 16-millisecond 60-hertz frame by almost three orders of magnitude. So I set to work optimizing, and now all those operations are fast. Here’s the results from the current version from the crude_performance_test below:: instantiation 0.015 ms nop 0.080 ms nop 0.018 ms filling with 2048 skinny lines 137.012 ms 1048576 counting coverage of -1 0.126 ms 1572352 counting coverage of -2, vertical 0.071 ms 1573376 counting coverage of -3, horizontal 0.063 ms taking snapshot 0.011 ms filling with -4 0.010 ms 4194304 counting coverage of -4 0.024 ms restoring snapshot 0.013 ms 0 counting coverage of -4 again 0.070 ms 1572352 counting coverage of -2 again 0.029 ms filling_masked with -4 0.145 ms 1048576 counting coverage of -4 again 0.076 ms nop 0.011 ms nop 0.019 ms total entries 8205 real 0m0.184s user 0m0.172s sys 0m0.012s This means that drawing the 2048 lines has gone from 10 seconds to 137 milliseconds, a speedup of 70×, and the fill_masked operation has gone from 4.6s to 145μs, a speedup of 32000×. However, this is unrealistically good, because my optimizations have transformed what used to be a worst case into a best case. I suspect that realistically the speedup in the new worst case is more like 10×. You’ll note that the nops in the above performance test were 11 to 80 microseconds, which means that the coverage counting results in the crisscross pattern were barely above measurement error, as was the fill_masked operation. Here are the only three lines that were outside that range: filling with 2048 skinny lines 137.012 ms counting coverage of -1 0.126 ms filling_masked with -4 0.145 ms I added an even cruder test where I fill the qanvas with a more random pattern, which makes fill_masked to take on the order of a quarter second still. That suggests that maybe the only way to use this in practice in a game is to spread the fill computation over multiple screen frames. Here’s one run:: total entries 8205 erasing again 0.015 ms 2062 random lines drawn drawing random lines 1406.313 ms fill_masked on random grid 242.622 ms Quadtree ======== The Qanvas is implemented using a Quadtree, which provides a much less stateful interface, relying on the caller to maintain most of the state. The units of the quadtree are called “quares”, for “quadtree squares”. A quare is a 2ⁿ × 2ⁿ axis-aligned square whose coordinates start at a multiple of 2ⁿ. Each quare is identified by a quare id number, or “qid”. Negative numbers are leafnodes representing pixel colors, while the quares identified by nonnegative numbers are four-way branching nodes that form the roots of immutable quadtrees. Because all quares are immutable, you can do a computation on them (such as how they should look on screen, or how many pixels of a certain color they contain) and safely memoize its results. Hash consing is used to ensure that (within a single Quadtree object) duplicate copies of the same immutable quadtree node are not stored. This both makes it relatively space-efficient and enables such memoized results to be applied broadly. Node sizes (e.g., 256×256) are not stored, enabling leafnodes to be applied to large areas. This could have surprising results if you aren’t consistent about keeping track of those node sizes. Initially you can use any negative number as a qid to represent an arbitrary-sized area of that color. For example, to draw a 5×4 rectangle starting at (6, 2), in color -2, on a background that’s initially of color -1, of size 16×16, we can do this:: >>> t = quares.Quadtree() >>> qid = t.update_rect(((6, 2), (5, 4)), val=-2, size=16, qid=-1) >>> quares.ascii_draw(t, qid, 16) //////////////// //////////////// //////.....///// //////.....///// //////.....///// //////.....///// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// >>> qid 633 Redrawing the same rect, or parts of it, in the same -2 color, doesn’t change the resulting qid:: >>> qid = t.update_rect(((6, 2), (5, 4)), val=-2, size=16, qid=qid) >>> qid 633 >>> qid = t.update_rect(((6, 2), (4, 3)), val=-2, size=16, qid=qid) >>> qid 633 Because of hash consing, (qid, size) pairs are the same whenever the pattern of pixels they identify are the same, and different whenever the patterns of pixels are different. If we make an actual change, here drawing a 7×2 rectangle of -3 at (2, 5), we get a new qid:: >>> qid = t.update_rect(((2, 5), (7, 2)), val=-3, size=16, qid=qid) >>> qid 762 >>> quares.ascii_draw(t, qid, 16) //////////////// //////////////// //////.....///// //////.....///// //////.....///// //-------..///// //-------/////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// But the previous version is still stored:: >>> quares.ascii_draw(t, 633, 16) //////////////// //////////////// //////.....///// //////.....///// //////.....///// //////.....///// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// In fact, if we erase the rectangle by redrawing it in the background color, and then redrawing the overlap in color -2, we get back the qid for that previous version:: >>> qid = t.update_rect(((2, 5), (7, 2)), val=-1, size=16, qid=qid) >>> quares.ascii_draw(t, qid, 16) //////////////// //////////////// //////.....///// //////.....///// //////.....///// /////////..///// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// >>> qid = t.update_rect(((6, 5), (3, 1)), val=-2, size=16, qid=qid) >>> quares.ascii_draw(t, qid, 16) //////////////// //////////////// //////.....///// //////.....///// //////.....///// //////.....///// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// //////////////// >>> qid 633 At this point there are a total of 26 items stored in the Quadtree’s hash table, representing all of the above drawings, their intermediate versions, and their subparts. We can get the parts of any quare with get_quare:: >>> t.get_quare(762) (778, 856, -1, -1) These are in the order northeast, northwest, southeast, southwest, assuming Y-coordinates increase downward. Here we can see that the southern half of 762 is solid -1. Its northeast quarter, 778, looks like this:: >>> quares.ascii_draw(t, 778, 8) //////// //////// //////.. //////.. //////.. //------ //------ //////// There’s a class that memoizes how many pixels are of a given color in order to compute it efficiently:: >>> qid = 762 >>> c = quares.CoverageComputer(t) >>> c.coverage(qid, 16, -1) 225 >>> c.coverage(qid, 16, -2) 17 >>> c.coverage(qid, 16, -3) 14 >>> 225+17+14 256 This visits each of the 14 internal nodes in the quadtree exactly once for each target pixel value. There’s an update_rect_masked method which only updates pixels that previously had a given pixel value, so in effect the rectangle is drawn underneath existing rectangles rather than on top of them:: >>> qid = t.update_rect_masked(((4, 4), (6, 7)), -4, 16, qid, -1) >>> quares.ascii_draw(t, qid, 16) //////////////// //////////////// //////.....///// //////.....///// ////,,.....///// //-------..///// //-------,////// ////,,,,,,////// ////,,,,,,////// ////,,,,,,////// ////,,,,,,////// //////////////// //////////////// //////////////// //////////////// //////////////// However, it requires worst-case O(N² log N) time rather than the O(N log N) required by update_rect. """ class Quadtree: def __init__(self): self.log = [] self.index = {} def total_entries(self): return len(self.log) def get_quare(self, qid): if qid < 0: # Negative quares are leafnodes: a solid color return qid, qid, qid, qid return self.log[qid] def put_quare(self, quare): # Check to see if we can coalesce a leafnode. nw, ne, sw, se = quare if nw < 0 and nw == ne == sw == se: return nw qid = self.index.get(quare) if qid is None: qid = len(self.log) self.index[quare] = qid self.log.append(quare) return qid def update_rect(self, rect, val, size, qid, table=None): """Return a new qid for the quare qid modified by setting rect to val. rect is ((start_x, start_y), (width, height)). Val should normally be negative to indicate a leafnode. qid is interpreted as a size×size square starting at (0, 0) and may be negative to indicate a leafnode or may be a qid returned from update_rect or put_quare. size should be a power of 2. """ # Three cases: inside rect, outside rect, overlapping rect. # The first two cases are easily handled; the third requires # recursing on children. (start_x, start_y), (width, height) = rect # Is the whole size×size square outside the rect? if (start_x <= -width or start_y <= -height or start_x >= size or start_y >= size): return qid # Or is the whole size×size square *inside* this rect? if (start_x <= 0 and start_y <= 0 and start_x + width >= size and start_y + height >= size): return val # If not, we must recurse on the children. # Attempt to memoize. First, though, clip the rect to the # sizexsize+0+0 bounds of this quare: if start_x < 0: width += start_x start_x = 0 if start_y < 0: height += start_y start_y = 0 width = min(size, width) height = min(size, height) if table is None: table = {} memo_key = qid, size, start_x, start_y, width, height memo_value = table.get(memo_key) if memo_value is not None: return memo_value nw_kid, ne_kid, sw_kid, se_kid = self.get_quare(qid) half_size = size // 2 new_nw_kid = self.update_rect(((start_x, start_y), (width, height)), val, half_size, nw_kid, table) new_ne_kid = self.update_rect(((start_x - half_size, start_y), (width, height)), val, half_size, ne_kid, table) new_sw_kid = self.update_rect(((start_x, start_y - half_size), (width, height)), val, half_size, sw_kid, table) new_se_kid = self.update_rect(((start_x - half_size, start_y - half_size), (width, height)), val, half_size, se_kid, table) result = self.put_quare((new_nw_kid, new_ne_kid, new_sw_kid, new_se_kid)) table[memo_key] = result return result def update_rect_masked(self, rect, val, size, qid, target, table=None): """Like update_rect, but only changes pixels of value target.""" (start_x, start_y), (width, height) = rect if (start_x <= -width or start_y <= -height or start_x >= size or start_y >= size): return qid # In this case, if the quare is entirely inside the rect, we # only avoid recursing in the case where it’s a leafnode. if (qid < 0 and start_x <= 0 and start_y <= 0 and start_x + width >= size and start_y + height >= size): return val if qid == target else qid # Attempt to memoize. First, though, clip the rect to the # sizexsize+0+0 bounds of this quare: if start_x < 0: width += start_x start_x = 0 if start_y < 0: height += start_y start_y = 0 width = min(size, width) height = min(size, height) if table is None: table = {} memo_key = qid, size, start_x, start_y, width, height memo_value = table.get(memo_key) if memo_value is not None: return memo_value nw_kid, ne_kid, sw_kid, se_kid = self.get_quare(qid) half_size = size // 2 new_nw_kid = self.update_rect_masked(((start_x, start_y), (width, height)), val, half_size, nw_kid, target, table) new_ne_kid = self.update_rect_masked(((start_x - half_size, start_y), (width, height)), val, half_size, ne_kid, target, table) new_sw_kid = self.update_rect_masked(((start_x, start_y - half_size), (width, height)), val, half_size, sw_kid, target, table) new_se_kid = self.update_rect_masked(((start_x - half_size, start_y - half_size), (width, height)), val, half_size, se_kid, target, table) result = self.put_quare((new_nw_kid, new_ne_kid, new_sw_kid, new_se_kid)) table[memo_key] = result return result def ascii_draw(quadtree, qid, size): buf = [[0] * size for i in range(size)] _update_buf(buf, quadtree, qid, size, (0, 0)) for row in buf: print(''.join(chr(32 + (item & 0xf)) for item in row)) def _update_buf(buf, quadtree, qid, size, origin): x, y = origin if size == 1: buf[y][x] = qid else: nw_kid, ne_kid, sw_kid, se_kid = quadtree.get_quare(qid) half_size = size // 2 _update_buf(buf, quadtree, nw_kid, half_size, (x, y)) _update_buf(buf, quadtree, ne_kid, half_size, (x + half_size, y)) _update_buf(buf, quadtree, sw_kid, half_size, (x, y + half_size)) _update_buf(buf, quadtree, se_kid, half_size, (x + half_size, y + half_size)) class CoverageComputer: def __init__(self, tree): self.tree = tree self.cache = {} def coverage(self, qid, size, target): cached = self.cache.get((qid, size, target)) if cached is not None: return cached result = self._compute_coverage(qid, size, target) self.cache[qid, size, target] = result return result def _compute_coverage(self, qid, size, target): if qid < 0: return size * size if qid == target else 0 half_size = size // 2 #print("getting", qid) return sum(self.coverage(kid, half_size, target) for kid in self.tree.get_quare(qid)) class Qanvas: """More convenient imperative interface.""" def __init__(self, size, background=-1): if size & (size - 1): raise ValueError("size must be a power of 2") self.size = size self.tree = Quadtree() self._coverage = CoverageComputer(self.tree) self.qid = background def snapshot(self): return self.qid def restore(self, snapshot): self.qid = snapshot def print(self): """Draw the pixel data represented in ASCII art.""" ascii_draw(self.tree, self.qid, self.size) def fill(self, x, y, width, height, color): """Fill a rectangle with color color at widthxheight+x+y. That is, horizontally from x to x+width, exclusive, and from y to y+height, exclusive. """ if color >= 0: raise ValueError("colors must be negative") self.qid = self.tree.update_rect(((x, y), (width, height)), val=color, size=self.size, qid=self.qid) def fill_masked(self, x, y, width, height, color, target): """In the rectangle at widthxheight+x+y, replace 'target' with 'color'. If target is the background color (default -1), the result will be that the filled rectangle is drawn, in effect, “underneath” any existing rectangles in that area. """ if color >= 0 or target >= 0: raise ValueError("colors must be negative") self.qid = self.tree.update_rect_masked(((x, y), (width, height)), val=color, size=self.size, qid=self.qid, target=target) def coverage(self, target): """Count how many pixels are colored the target color.""" if target >= 0: raise ValueError("colors must be negative") return self._coverage.coverage(self.qid, self.size, target) def crude_performance_test(): from time import time class Timer: def __init__(self): self.t = time() def lap(self): then = self.t self.t = now = time() return now - then def print(self, what): print(what, "%.3f" % (self.lap()*1000), "ms") from random import randrange t = Timer() q = Qanvas(2048) t.print("instantiation") t.print("nop") t.print("nop") for i in range(0, 2048, 2): q.fill(i, 0, 1, 2048, -2) q.fill(0, i, 2048, 1, -3) t.print("filling with 2048 skinny lines") print(q.coverage(-1)) t.print("counting coverage of -1") print(q.coverage(-2)) t.print("counting coverage of -2, vertical") print(q.coverage(-3)) t.print("counting coverage of -3, horizontal") s = q.snapshot() t.print("taking snapshot") q.fill(0, 0, 2048, 2048, -4) t.print("filling with -4") print(q.coverage(-4)) t.print("counting coverage of -4") q.restore(s) t.print("restoring snapshot") print(q.coverage(-4)) t.print("counting coverage of -4 again") print(q.coverage(-2)) t.print("counting coverage of -2 again") q.fill_masked(0, 0, 2048, 2048, -4, target=-1) t.print("fill_masked with -4") print(q.coverage(-4)) t.print("counting coverage of -4 again") t.print("nop") t.print("nop") print("total entries", q.tree.total_entries()) q.fill(0, 0, 2048, 2048, -1) t.print("erasing again") drawn_lines = 0 for i in range(2048): if randrange(2): q.fill(i, 0, 1, 2048, -2) drawn_lines += 1 if randrange(2): q.fill(0, i, 2048, 1, -3) drawn_lines += 1 print(drawn_lines, "random lines drawn") t.print("drawing random lines") q.fill_masked(0, 0, 2048, 2048, -4, target=-1) t.print("fill_masked on random grid") if __name__ == '__main__': crude_performance_test()