Dec. 29, 2016, 4 p.m.

Skeleton Meshes

(Break it all apart!)

This post explores a slightly unusual design choice for some key data structures at PathEngine.

More specifically, it's about polygon mesh data structures (in C++), and follows on from a previous post about polygon meshes 'in the style of std::vector'.

One of the consequences of a vector-like approach is that mesh elements can be referenced by index, and then accessing elements by index means we can break stuff apart.

It's a story of violated object oriented programming principles, broken encapsulation, and unsafe code, but the code we end up with is then more functional and a lot more fun to work with!

(This is part of a series about meshes in PathEngine, as well as another series about vectors and vector based containers.)

The story so far

There are good reasons to prefer vector-like data structures to list-like data structures.

This is covered pretty well in 'why you should never ever ever use linked list in your code again', but it mainly comes down to maximising cache utilisation.

It turns out that we can 'host' a set of linked nodes in a contiguous buffer to get both vector-like performance and list-like dynamic update.

And then, putting face, edge, and vertex connectivity information each into their own vector hosted list gives us a kind of 'vector mesh'.

But what about the contents?

The 'vector mesh' class I showed previously only includes connectivity information.

But in practice you're going to want to store data for elements in a mesh (coordinates for mesh vertices, surface attributes for mesh faces, and so on).

This post discusses the approach taken for this kind of mesh content data, at PathEngine, some of the reasons why I decided to go this way, and some implications of the approach.

PathEngine meshes

So we need to work with a bunch of different meshes, each with different associated content data.

There are 'ground meshes', for example (representing actual 3D ground surfaces), and '2D meshes' (with height detail stripped out), but also a bunch of other mesh representations, corresponding to different stages of our 3D content processing pipeline.

And then we have a load of generic, shared code for mesh operations, that we don't want to rewrite for each mesh.

Templated containers

One solution is to make our meshes into templated containers (further extending the analogy to std::vector).

For example, the simple vector mesh I posted (for push_back() style construction) started off as follows:

#include <vector>
#include <cstdint>

class cTriMesh
{
  // {twin edge, edge vert} * 3 * number of tris
    std::vector<int32_t> _edgeInfo;
  // the first edge around each vertex
    std::vector<int32_t> _vertFirstEdge;

//... rest of class implementation
};

We can add some template parameters to this mesh class, and make this into a container, as follows:

#include <vector>
#include <cstdint>

template <class tVData, class tEData, class tFData>
class cTriMeshContainer
{
  // {twin edge, edge vert} * 3 * number of tris
    std::vector<int32_t> _edgeInfo;
  // the first edge around each vertex
    std::vector<int32_t> _vertFirstEdge;

  // data 'contained' by the mesh elements
    std::vector<tVData> _vertData;
    std::vector<tEData> _edgeData;
    std::vector<tFData> _faceData;

//... rest of class implementation
};

For a dynamic vector mesh (as discussed here) the vectors would be replaced with 'vector hosted lists' (such as cVectorWithDeadEntries).

And a full implementation will need to provide methods for access to the contained data, and make sure the content data is updated appropriately after any changes, but you get the general idea.

For clarity, let's call this approach container meshes.

An alternative approach

When I first wrote PathEngine I started out with a (fairly standard) pointer based dynamic mesh data structure, and in that context the 'container meshes' approach is pretty much the only way to go.

But then, as the SDK evolved to better meet customer performance requirements, these data structures got reworked as 'vector meshes'.

Vector meshes use contiguous buffers internally. This means that you can access elements directly by index, and this opened up the possibility for an alternative approach.

Skeleton meshes

The 'trick', as far as it goes, is to leave the mesh data out of the mesh class, and have this data represented somewhere else (in a kind of 'parallel array').

So a mesh with vertex coordinates might look like this:

// so this is just a tri mesh
// with no template parameters for content data types
cTriMesh mesh;
// and then content data is represented separately
// (e.g. an array of points, with one entry per vertex)
vector<cPoint> vertexPoints;

This is possible because (unlike the pointers to nodes in a traditional pointer mesh data structure) a single index can refer to both elements in the mesh itself and entries in our independent coordinate data vector.

Code to access this data might then look something like this:

cPointDiff EdgeAxis(
    const cTriMesh& mesh,
    const vector<cPoint>& points,
    int edge
    )
{
    int vert = mesh.vert(edge);
    cPoint startP = points[vert];
    edge = mesh.next(edge);
    vert = mesh.vert(edge);
    cPoint endP = points[vert];
    return endP - startP;
}

(I'm using the tri mesh class and interface from my previous post, but the exact details of the mesh implementation and interface are not really important.)

There's a bit more to it, as we'll see later, but again, you get the idea.

Let's call this approach skeleton meshes.

Skeleton meshes FTW!

So, soon after switching to vector meshes, I made a decision to also switch PathEngine over to skeleton meshes, even though this meant fairly sweeping changes to a whole bunch of code. All of the mesh data structures used by PathEngine are now implemented as skeleton meshes.

Bad practise?

Hold on a minute, are you crazy? (You might ask, at this point.)

There are some good reasons why the skeleton meshes approach should be considered bad practise.

We're told to prefer iterators over direct indexing, for one thing.

And then skeleton meshes break encapsulation. This is anti-object-oriented, and makes mesh related code less 'safe'.

But, my experience is that skeleton meshes worked out really well for us.

Container meshes were a nightmare to work with, in practice. And switching to skeleton meshes solved some specific problems for us, and resulted in cleaner, more maintainable, more robust code.

So there's a discrepancy between some common ideas about good ways to structure code, and my practical experiences of what actually worked best. And that's worth looking at more closely, I think.

(Software architecture is important, and difficult to get right. We need heuristics to help us with this, but we should remember that these are only heuristics, and be constantly looking to refine and improve these heuristics based on our experiences, and on other people's ideas.)

Let's look at how this works out, then, in some more detail..

Prefer iterators?

Well look at the iterator/indexing side of things first of all (as I think this is kind of a lesser issue).

So there are two main reasons, as I see it, to prefer iterators over raw indexed access.

The first reason is to abstract away from concrete implementation details of the target container, and the second is related to the fact that iterators tend to be strongly typed.

Abstracting away from concrete container type

If I have some code that steps through a vector, and suddenly find I have to work on a linked list instead, code based on iterators will be easier to adapt than code that accesses the vector directly by index.

(This is the main reason given in the top answers to this stack overflow question about why iterators should be used instead of array indices.)

But in PathEngine we kind of just took the nuclear option, and went ahead and changed all of our concrete mesh implementations to support direct indexed access. We no longer have any pointer based mesh data structures, and so situations analogous to switching from vector to linked list just don't arise.

Note that we still abstract away from implementation details, to some extent, by allowing for 'dead entries'. (The idea of dead entries, and a 'consolidate' operation are the basis for our vector hosted list data structure, and dynamic vector mesh implementation.)

And because we know that we can always use direct indexed access, we can go ahead and take advantage of this to implement faster and/or simpler versions of mesh algorithms.

Iterator types

The second reason to prefer iterators is the fact that iterators tend to be strongly typed.

This was actually an issue for us when migrating from our old vector mesh code to skeleton meshes. The iterators we used to access our mesh data structures before the change were typed, with this type including both the type of element being iterated over and the type of the containing mesh.

So the compiler was able to catch a whole class of potential coding errors as type mismatches.

Simply replacing these iterators with raw integer values would mean losing all this compile time checking, and was not acceptable.

Typed indices and vectors

If you look back at the EdgeAxis() function I posted above, index values are stored as simple integers, and 'edge' and 'vert' have exactly the same type. But it's easy enough to improve on this.

In this previous post I already introduced some typed index classes, using a 'tag dispatch' mechanism, to avoid confusion between vertex, edge and face index values (or between these index values and other integer values).

(You can find some more discussion about tag dispatch on this recent blog post, and on an associated reddit thread.)

Our mesh index types are named cFace, cEdge, cVert (short names because these get used a lot).

And then we also do something similar, for typed container classes.

I won't go into implementation details but the key point is that have custom typed containers for storing mesh data, and we get type checking for index values, as follows:

cPointDiff EdgeAxis(
    const cTriMesh& mesh,
    const cVertData<cPoint>& points,
    cEdge edge
    )
{
    // ** compiler error **
    //cPoint startP = points[edge];

    cVert vert = mesh.vert(edge);
    // ** ok **
    cPoint startP = points[vert];

    // ** compiler error **
    //edge = mesh.next(vert);

    // ** ok **
    edge = mesh.next(edge);

    vert = mesh.vert(edge);
    cPoint endP = points[vert];
    return endP - startP;
}

Our typed mesh data containers are named cFaceData, cEdgeData, cVertData, each templated on contained element type, with interface and buffer allocation semantics similar to std::vector but with each container only indexable by the corresponding mesh index type.

Index type is independent of mesh type

Our index types are still independent of mesh type, and we lose that aspect of iterator type-checking.

There were some quite specific cases with code that references multiple mesh structures at the same time, and with variables like 'e1', 'e2', 'candidateEdge' and so on, where the lack of clarity about exactly which mesh these referred to was a bit of a pain.

But it turns out that this is a trade-off, and the possibility to use a single index value to refer to multiple meshes is actually a significant benefit in a bunch of other cases, as we will see later..

Other benefits of indexed access

And then there are some other benefits of using index values instead of iterators, in particular the possibility to save and load index values directly to and from persistence.

In general, I'm pretty happy with the iterators/indexing side of things, so let's move on to the other objections..

Failure to encapsulate

The remaining issues essentially boil down to the fact that skeleton meshes break encapsulation (and in doing so make calling code less safe).

One thing that helps illustrate this problem is the concept of class invariants. But, more generally, the issue is that changes in mesh structure are kind of naturally linked to changes in the organisation of associated mesh data.

Class invariants

A nice thing about object oriented programming is the possibility to encapsulate operations in a class in a way that guarantees that some kind of class invariants always remain true.

Stroustrup is very keen on classes enforcing invariants. This is advice I try to follow, and something that has helped me on many occasions when debugging complex and difficult behaviours.

But consider the following code, with the 'skeleton mesh' approach:

void AddTri(
    cTriMesh& mesh,
    cVertData<cPoint>& points,
    cFaceData<cFaceAttributes>& attributes,
    cPoint p1, cPoint p2, cPoint p3
    )
{
    cVert v1 = points.push_back(p1);
    cVert v2 = points.push_back(p2);
    cVert v3 = points.push_back(p3);
    mesh.addTri(v1,v2,v3);
    //! forgot to update attributes
    //! to add an entry for the new face
    //! bad things may happen!
}

With an (object oriented) container mesh approach we could make it a class invariant that the size of the face attributes data vector always matches the mesh 'face size', and check and enforce this at the end of every modifying method.

Other linkage between mesh and mesh data

It's not just about the size of the content data vectors, though, and the objection is not just about keeping mesh and data in some measurably 'valid' state.

We also need to ensure that the values (and ordering of values) in our data vectors match the corresponding mesh elements.

If some code swaps the position of two faces in a mesh, for example, mesh data entries for those elements should also be swapped.

With a container mesh approach, the code for swapping faces would naturally also have the responsibility of swapping the corresponding data entries.

Lower level code

Skeleton meshes effectively externalise the responsibility for maintaining data size invariants, making sure that mesh data matches mesh structure, and so on, and writing code to take care of this externally can feel like writing lower level code.

Lower level, ..or more functional style?

Actually, I think this felt like low-level programming because I come from a low level coding background.

Luckily, I'm quite happy with using a lower level programming style, when it's necessary to get things done, or when this just feels like a better way to set things up, and the fact that skeleton meshes felt like a lower level coding style didn't put me off taking this approach.

But since then I've also played around a bit with functional programming (in Haskell, for example), and skeleton meshes also turns out to be a much more 'functional programming style' approach to the problem.

In Haskell (and other functional programming languages) encapsulating data into classes (and defining invariants) is much less of a thing, and instead it's more important to specify the exact data flow for your processes, and this is also how the skeleton meshes approach works out.

Swings and roundabouts

It turns out that there's something of a trade-off here, and while breaking our mesh and data apart certainly does break encapsulation, there are also a whole bunch of advantages that come out of this, in particular if you want to take more of a functional programming approach.

Only reference the bits you need

Consider a simple function to compute the two-dimensional extents of mesh geometry:

// container mesh version
cHorizontalRange GetExtents(
    const cTriMeshContainer<cPoint,cEdgeData,cFaceAttributes>& mesh
    );

// skeleton mesh version
cHorizontalRange GetExtents(
    const cVertData<cPoint>& vertexPoints
    );

The container mesh version introduces a bunch of dependencies that aren't actually required for the task at hand.

The cEdgeData and cFaceAttributes types need to be defined, for example, as well as the mesh container type, bringing in extra include files, increasing code linkage and compilation times, and so on.

But we've also had to limit the function to this specific type of mesh. If we need extents for a mesh with a different edge data type, later on, we won't be able to call this function as currently defined. Instead we'll end up templating it by mesh type, or otherwise complicate the situation, even though edge data isn't relevant to the function.

The GetExtents() method is kind of an extreme case, since we don't need to access the mesh data structure itself at all.

In some other extreme cases, code may only need to operate on the mesh itself (ignoring content data completely), e.g.:

void EraseConnectedComponents(
    cDynamicMesh& mesh, cFace startF
    );

And then there are a bunch of situations where code operates on the (skeleton) mesh and just some of the associated data components..

More granular control of mutability

As well as finer control over dependencies, breaking things apart also gives us the possibility to control how individual parts of a mesh are accessed.

For example, we can see directly from this function signature that the mesh itself is not modified, only face data:

// set face attributes for
// a connected subset of a mesh
void FillFaceAttributesTypes(
    const cTriMesh& mesh,
    cFace startFace,
    cFaceAttributes fillWith,
    cFaceData<cFaceAttributes>& faceData
    );

Only test and debug the bits you need

So there are various ways to specify data flow for mesh operations more clearly and this had a bunch of other important knock-on effects, apart from dependency reduction.

In particular, I find minimal and more specific operations much easier to test and debug.

As well as making it easier to maintain a mental overview of exactly what's going on in the code for a given operation, having more minimal and specific data flow also makes it much easier to construct repeat cases while debugging, and fits in very nicely with a functional approach to unit testing.

Encapsulate processes?

In the PathEngine source code, the code for each of these mesh operations (GetExtents(), EraseConnectedComponents(), FillFaceAttributesTypes() etc.) gets split off into its own C++ object file, with minimal dependencies, and a clear 'functional' interface.

And then, when we reduce the amount of data passed in to these functions, or constrain the possibilities for state changes, what we're doing is making a narrower 'interface' to these processes, and making our code base more modular.

You can think of this as a different kind of encapsulation, then, but where we are now encapsulating processes instead than data. (Which is more important?)

Avoiding mutation

The skeleton meshes approach really shines where at least some of the data involved is immutable.

But I found that having the possibility to split mesh and data also resulted in my thinking differently about some of our data flow, and reworking processes so that modifications to mesh state can be restricted to one part of the mesh.

Consider something like this, for example

// simplifies the mesh by removing edges
// where the resulting faces remain convex
template <class tE, class tF>
void ConnectConvexPolys(
    cDynamicMesh& mesh,
    cVertData<cPoint>& vertData,
    cEdgeData<tE>& edgeData,
    cFaceData<tF>& faceData
    );

Internally, this is going to call a joinFaces() method, on the dynamic mesh, to connect two faces sharing an edge into a single face.

This deletes one of the original faces, as well as the connecting edge (or the two connecting half-edges, to be precise), but never creates new mesh elements.

The dynamic mesh handles entry deletion by marking entries as dead, with a subsequent call to consolidate() to remove the dead entries and reorder everything else.

Assuming the mesh data entries are POD ('plain old data') types the only place where ConnectConvexPolys() actually needs to modify the data arrays is in that final consolidate operation.

Pulling out the consolidate operation then enables us to rewrite the method definition as follows:

// simplifies the mesh by removing edges
// where the resulting faces remain convex
// ** leaves dead entries
//    for deleted faces and edges **
void ConnectConvexPolys(
    cDynamicMesh& mesh,
    const cVertData<cPoint>& vertData
    );

Restricting the possibilities for mutation like this makes me quite happy, and sometimes I'd consider taking this even further, e.g. passing in a virtual interface that specifically only allows that joinFaces() operation!

Best of both worlds

Avoiding mutation is great, where this is possible, but there are cases where processes just do need to modify mesh and data all together.

It turns out we can kind of get the best of both worlds, then, by encapsulating our skeleton mesh classes and data together in a kind of 'container mesh wrapper', where this is necessary.

The wrapper class we use for this is set up as follows:

template <class tMesh, class tF, class tE, class tV>
class cMeshAndData
{
    // ** not owned, just references essentially **
    tMesh* _mesh;
    cFaceData<tF*> _faceData;
    cEdgeData<tE>* _edgeData;
    cVertData<tV>* _vertData;

//... rest of class definition
//... with methods that operate on mesh and data at the same time
};

There are a bunch of methods in this class, with the same names as modifying methods in our mesh classes, but with code to also update the mesh data accordingly.

For example:

cVert
cMeshAndData::splitEdge(cEdge e, tV vertValue)
{
    cEdge twinE = _mesh->twin(e);

// inserts a new vertex in the middle of an edge
// and returns the index of the newly created vertex
    cVert v = _mesh->splitEdge(e);

// according to expected semantics for _mesh->splitEdge()
// edges before the split, on either side
// should keep their original indices
// with new index values assigned to edges
// after the split

    if(_mesh->edgeSize() > _edgeData->size())
    {
        _edgeData->resize(_mesh->edgeSize());
    }
    (*_edgeData)[_mesh->next(e)] = (*_edgeData)[e];
    if(IsValid(twinE))
    {
        (*_edgeData)[_mesh->next(twinE)] = (*_edgeData)[twinE];
    }
    if(_mesh->vertSize() > _vertData->size())
    {
        _vertData->resize(_mesh->vertSize());
    }
    (*_vertData)[v] = vertValue;
    return v;
}

and:

void
cMeshAndData::breakEdge(cEdge e)
{
    cEdge twinE = _mesh->twin(e);
    _mesh->breakEdge(e);
    // if this breaks the mesh,
    // new vertices may be created
    if(_mesh->vertSize() > _vertData->size())
    {
        _vertData->resize(_mesh->vertSize());
    }
    // new verts should then be assigned to twinE
    // (according to expected semantics for breakEdge())
    // either or both of the following assignments can then be self-assignments
    (*_vertData)[_mesh->vert(twinE)] = (*_vertData)[EndVert(_mesh, e)];
    (*_vertData)[EndVert(_mesh, twinE)] = (*_vertData)[_mesh->vert(e)];
}

So there's some kind of semantic linkage there, with these methods needing to know some quite specific details about the semantics of the underlying mesh operations.

You could say that this makes the base mesh interface something of a leaky abstraction, but I see this a bit differently.

I see this more as well defined semantics.

I like the fact that exact behaviour of these underlying mesh operations is well defined, and I feel like this is beneficial for other mesh related code. I think that depending on exactly specified behaviour can make other higher level code simpler and more efficient.

Mesh data update can depend on the data model

Data updates are not always tightly linked with corresponding mesh operations, however. Sometimes this depends on the exact data model being applied.

For example, there are two ways for us to interpret a base flipEdge() operation, depending on what exactly our edge data represents:

void
cMeshAndData::flipEdge_StartOfEdgeModel(cEdge e)
{
// updates edge data from new neighbours
// i.e. corresponds to 'edge data at vert' model,
// such as where edge data holds vertex Z
    cEdge twinE = _mesh->twin(e);
    assert(IsValid(twinE));
    _mesh->flipEdge(e);
    (*_edgeData)[e] = (*_edgeData)[_mesh->next(twinE)];
    (*_edgeData)[twinE] = (*_edgeData)[_mesh->next(e)];
}
void
cMeshAndData::flipEdge_MidEdgeModel(cEdge e)
{
// leaves edge data as it was!
    _mesh->flipEdge(e);
}

For these cases, and for different mesh construction situations, it can actually be preferable to abstract details for how mesh content should be updated away from the core mesh data structure.

Encapsulation, when you want it

So we get the possibility to split things apart, when that makes sense, but we can also choose to encapsulate operations on mesh and data, when we want this encapsulation, with a separate layer of code to specify how exactly mesh data should be updated after mesh changes.

In some cases the extra layer provided by cMeshAndData is less natural than integrating data update into the mesh itself, but there are also cases where this can be more natural, and in practice our container meshes ended up being something of a monolithic horror, so from that point of view anything that lets us split this into separate bits of code in a reasonable way is a good thing.

External 'invariant'

It turns out it's not so hard to add an invariant to our mesh and data, as well, when we want this, without this having to be part of a class definition.

We can get essentially the same kinds of benefits for debugging complex situations by adding something like this:

template <class tMesh, class tF, class tE, class tV>
bool DataSizesMatch(
    const tMesh& mesh,
    const cFaceData<tF>& faceData,
    const cEdgeData<tE>& edgeData,
    const cVertData<tV>& vertData
    )
{
    return
        mesh.faceSize() == faceData.size() &&
        mesh.edgeSize() == edgeData.size() &&
        mesh.vertSize() == edgeData.size();
}

This is harder to add everywhere where mesh or data is modified, but it's doable when you need it.

And, of course, the cMeshAndData wrapper can get a more standard class invariant, as well.

Other benefits

There are some other advantages to the skeleton mesh approach, in our specific case.

Sharing data between meshes

With mesh and data as separate objects it's possible for two or more skeleton meshes to share data that would otherwise need to be replicated with each mesh.

A concrete example of this a simplified version (or retriangulation) of an original mesh, built around the same set of vertex points. Sharing data in cases like this can then be a big win, for both performance and memory footprint.

Avoiding templates

Skeleton meshes made it possible for a lot of mesh related logic in PathEngine to be expressed through non-template code, where previously this had to be templated.

Templates are great for avoiding code repetition, but come at some cost. Non-template code is just that bit easier to work with, but as well as reducing cognitive load, avoiding templates can also significantly reduce build times and executable size.

Mesh specialisations

Splitting the core mesh data structure out from associated content data (and content data types) made it possible for us to work with multiple specialised versions of these core mesh data structures, in a way that just would not have been manageable with the previous container mesh setup.

Conclusion

Breaking our container meshes apart meant breaking encapsulation, but this was definitely the right way to go for PathEngine.

Encapsulation is a good thing, up to a point, but sometimes pure object oriented approaches can lead to big old monolithic classes that get difficult to work with, fast.

Sometimes you need to split things apart, and make things a bit more functional!

Laying data components out in parallel, with a shared indexing strategy, can be one way to do this.