// R3cts: linear-time table layout solver with no dynamic allocation. ꙮ
//
// The problem: you have a potentially large set of layout constraints
// of the form xᵢ ≥ xⱼ + k.  Implicitly x₀ = 0.  You want to find the
// smallest set of xᵢ that satisfies these constraints.  i and j are
// not too big; they might be up to a few thousand, but will more
// typically be up to, like, 8.
//
// We use the same solver for horizontal layout and vertical layout,
// but WOLOG I’ll explain the horizontal case.
//
// The constraint xᵢ ≥ xⱼ + k arises when some table cell starting in
// column j and ending in column i - 1 ≥ j needs a minimum width k,
// including any borders and margins and padding.  Often i - 1 = j,
// but if you have table cells that span across multiple columns, it
// might be much greater.  Consequently the number of constraints may
// be quadratically larger than the number of values that need to be
// computed.
//
// The algorithm has two passes: first, we load up all of the
// constraints, and second, we solve for each of the xᵢ from first to
// last.  Each constraint is visited once in each pass, in constant
// time, so the algorithm as a whole takes linear time.  On one core
// of my laptop it solves about 200
// million constraints per second under normal conditions, though it
// might drop to only ten million with bad enough locality.


// An assertion that position[end] ≥ position[start] + width.
// Can only be added to one r3_system at a time.
typedef struct r3_constraint {
  // Requirement: start < end.
  int start, end, width;
  // prev is a link to previous constraint with same `end`.
  struct r3_constraint *prev;
} r3_constraint;

typedef struct {
  int n;          // maximum number of values to compute, at least 1
  int m;          // actual number of values computed; initialize to 0
  r3_constraint **latest; // points to a null-initialized array of size n
} r3_system;

// Both of the possible errors from r3_add_constraint:
static const char *r3_err_n_too_small = "constraint end exceeds system n";
static const char *r3_err_constraint_backwards =
  "constraint start must precede end";

// Returns NULL on success, error message string on failure
static inline const char *
r3_add_constraint(r3_system *r3, r3_constraint *c)
{
  int end = c->end;
  if (end >= r3->n) return r3_err_n_too_small;
  if (c->start >= end) return r3_err_constraint_backwards;

  // Add c to linked list for its end.
  c->prev = r3->latest[end];
  r3->latest[end] = c;
  if (r3->m <= end) r3->m = end+1;
  return 0;
}

// Places r3->m solver results in position[].  Never fails.
static inline void
r3_solve_system(r3_system *r3, int position[])
{
  r3_constraint **latest = r3->latest;
  int m = r3->m;
  position[0] = 0;

  for (int i = 1; i < m; i++) {
    // Compute the position number i.  Minimal possible answer is 0.
    int pos_i = 0;

    // Relax the draft solution to satisfy each relevant constraint.
    // This inner loop follows an unpredictable pointer, but compiles
    // to 9 amd64 instructions.
    for (r3_constraint *c = latest[i]; c; c = c->prev) {
      int min_pos = position[c->start] + c->width;
      if (min_pos > pos_i) pos_i = min_pos;
    }
    position[i] = pos_i;
  }
}

/*
The above is designed to be called in contexts where failure is not an
option.  The only failure cases are from programming errors: invalid
constraints, or not allocating enough space in `latest` for the number
of items you’re trying to compute positions for.  This is by design,
because I want to use it in my terminal, and I don’t want my terminal
to crash.

In particular, allocating memory dynamically can fail, so it doesn’t
do that, even though that can make it less convenient to use.
Allocating heap memory in C also usually suffers the drawback that you
must deallocate the memory, but not too soon.  By not allocating
memory dynamically, r3cts avoids any need for an `r3cts_free_system`
or similar call.  The idea is to embed the memory for the constraints
in whatever other data structure pertains to the table layout.

I asked GPT-4 for suggestions for improving the above API, and I think
a lot of people are likely to think along the same lines as its
commentary; here’s what it said, along with my commentary.

> The API provided (...) could be improved in several ways to enhance
> usability, safety, and maintainability. Here are some suggestions:
> 
> 1. **Encapsulation:**
> 
>    - Provide functions to create and destroy `r3_system` instances
>      rather than requiring users to manage memory allocation manually.
> 
>    - Hide the implementation details of `r3_system` and
>      `r3_constraint` by using opaque pointers.

This would make R3cts more convenient to call, and it would enable
backwards ABI-compatibility with separate linking, but it would
require dynamic allocation inside the library, which I want to avoid.
If I didn’t, I wouldn’t be programming in a low-level language.  Maybe
I’ll add a wrapper that does this, though.

As a header-only library, this is the opposite extreme from separate
linking.  There’s no worry about backwards ABI-compatibility because
there’s no separate linking.  If I were to add more fields to
`r3_system` or `r3_constraint`, as long as 0 is a valid initial value
for them and the caller uses initializer syntax, a simple recompile
will work fine.

What will *not* work fine is code like this, which leaves any unlisted
fields of `r3_system` uninitialized:

    r3_system r3;               // BAD
    r3.n = n;                   // BAD
    r3.m = 0;                   // BAD
    r3.latest = calloc(n, sizeof(r3_constraint*)); // BAD

For backwards compatibility, you should *always* initialize it like
this, which guarantees initializing all unlisted fields to 0:

    r3_system r3 = { .n = n, .latest = calloc(n, sizeof(r3_constraint*)) };

...but eh, maybe GPT-4 was right that not providing an initialization
function tempts people to write code like the bad example above.

> 2. **Initialization Functions:**
> 
>    - Offer an initialization function for `r3_system`, which would
>      handle setting up the initial state correctly.

This may be a good idea, but the most critical aspect of setting up
the state is ensuring that the `latest` array is big enough, which
can’t be done in C without dynamic allocation.

>    - Similarly, provide a constructor-like function for creating
>      constraints.

I think this is not really useful.  You can initialize a constraint by
saying {37, 38, 105} or { .start = 37, .end = 38, .width = 105 }, and
a constructor call cannot compete with that for clarity.  The only way
a constructor-like function would be useful would be that it would
give you a compile error if you left out an argument.  And you
wouldn’t be able to use it for statically initialized constraint
structures.

> 3. **Error Handling:**
> 
>    - Use a more robust error reporting mechanism that can provide
>      detailed error messages or codes.
> 
>    - Return an enum or define constants for different error conditions
>      instead of just true/false.

This was a good idea, and I’ve applied it.

> 4. **Memory Management:**
> 
>    - Automatically resize internal structures if the number of
>      constraints exceeds initial expectations (`n`) rather than
>      failing.
> 
>    - Manage memory internally for constraints so that users don't have
>      to allocate them statically or on the heap.

As explained above, this would make the library unreliable, because
sometimes heap allocation fails, and require a deallocation
function — which the user could forget to call, resulting in a memory
leak.

> 5. **Validation:**
> 
>    - Add checks within `r3_add_constraint()` to ensure that constraint
>      indices (`start`, `end`) are within bounds and that `start <
>      end`.

I was already verifying that `end` was within bounds, and checking
`start` is probably a good idea (the cost was a single extra
comparison) so I added it.

> 6. **Consistency Checks:**
> 
>     - Implement consistency checks before solving, ensuring no
>       contradictory constraints have been added.

Because of the nature of the constraints, contradictory constraints
are impossible.

> 7. **Documentation:**
> 
>     - Provide comprehensive documentation comments explaining each
>       function’s purpose, parameters, return values, and any side
>       effects.

This is a good idea.

> 8. **Return Values & Output Parameters:**
> 
>     - Change functions like `r3_solve_system()` to return an error code
>       or status instead of void; use output parameters only for
>       actual outputs (like positions).

`r3_solve_system` cannot fail, and it was already returning its output
positions through a pointer parameter.

> 9. **Naming Conventions:**
> 
>     - Use consistent naming conventions throughout the API (e.g.,
>       prefix all public functions with `r3_` if that is the chosen
>       namespace).

I was doing this already, GPT-4 just didn’t notice.

> 10. **Thread Safety:**
> 
>      - Make sure that operations on different instances of `r3_system`
>        are thread-safe if multi-threading support is desired.

Ensuring that the same constraint isn’t used in two `r3_system`s at
once, in the same thread or different threads, is the caller’s
responsibility.

> 11. **Utility Functions:** 
> 
>      - Provide utility functions such as printing/debugging
>        constraints and systems for easier debugging by users.

It’s simple enough that this probably isn’t necessary, since there’s
no encapsulation.

> 12. **Examples & Tutorials:** 
> 
>      - Include example programs demonstrating common use cases and
>        best practices when working with this API.

This is a good idea.

*/