Another way to do datagen

Super late, writing this on my phone before bed.

What’s bad about datagen?

Can we do better?

Facets

A “facet” is any aspect of any piece of content that needs some attention.

Here are some examples of facets:

Here are some facets that are less traditionally datagenned:

Facet Holders

A facet holder is anything with a bag of related facets. An “item” might have an associated lang entry, creative tab, and crafting recipe, for example, so it makes sense as a facet holder.

The purpose of creating specialized facet holders is building a domain-specific language over facets. The facet for assigning an item model requires a model and an item to assign it to. When you are working within an item’s facet holder, you have the item ID available and don’t need to manually thread it through.

Importantly:

The domain-specific language for a recipe facet might prefill the recipe ID based off the crafting result’s ID, but you can always pick a different ID. Special cases (like an item with two recipes or models or whatever) do not break the system.
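A minimal sketch of that idea, assuming nothing about the real code: every name here (Facet, LangFacet, RecipeFacet, FacetHolder, ItemFacetHolder) is made up for illustration, and IDs are plain strings. The item holder fills in the item ID for you, and the recipe helper defaults the recipe ID to the item ID while still letting you pass your own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical facet types, invented for this sketch.
interface Facet {}
record LangFacet(String key, String text) implements Facet {}
record RecipeFacet(String recipeId, String resultId) implements Facet {}

// A facet holder is just a bag of facets.
class FacetHolder {
    final List<Facet> facets = new ArrayList<>();

    void add(Facet facet) {
        facets.add(facet);
    }
}

// A specialized holder: it knows the item ID, so callers never thread it through.
class ItemFacetHolder extends FacetHolder {
    final String itemId;

    ItemFacetHolder(String itemId) {
        this.itemId = itemId;
    }

    // "mymod:widget" becomes the lang key "item.mymod.widget".
    void lang(String text) {
        add(new LangFacet("item." + itemId.replace(':', '.'), text));
    }

    // Default: the recipe ID is prefilled from the item ID.
    void recipe() {
        recipe(itemId);
    }

    // Special case: a second recipe for the same item gets its own ID.
    void recipe(String recipeId) {
        add(new RecipeFacet(recipeId, itemId));
    }
}

class FacetDslDemo {
    public static void main(String[] args) {
        ItemFacetHolder widget = new ItemFacetHolder("mymod:widget");
        widget.lang("Widget");
        widget.recipe();                          // recipe ID defaults to the item ID
        widget.recipe("mymod:widget_from_scrap"); // explicit ID for the extra recipe
        widget.facets.forEach(System.out::println);
    }
}
```

The overload pair is the whole trick: the common case costs zero arguments, and the odd case is one extra string rather than a different code path.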

Implementation

In Templates I have a Tmpl holder which represents a block/item pair. It takes the block ID as an argument and uses the double-brace initialization idiom to run code within that context: https://codeberg.org/quat/templates-mod/src/commit/d748e7ca0fe27b25e30f81915951d9553433330d/src/dgen/java/io/github/cottonmc/templates/dgen/Dgen.java#L64

So I make a bunch of Tmpls.

Then I create a single giant FacetHolder, pour all the facets into it (plus a few more that don’t belong to any Tmpl), and loop over each facet by type. Sometimes 1 facet = 1 JSON file; other times I loop over all facets of a type and collect them into a big array, which I write out as a single JSON file.
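Roughly what that merge-then-loop step could look like. This is a sketch under assumptions, not the code from the linked repo: GenFacet, LootFacet, TagFacet, GenFacetBag and GenWriter are hypothetical names, the JSON is a toy shape rather than Minecraft’s real formats, and files are returned as a path-to-text map instead of being written to disk.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical facet types, invented for this sketch.
interface GenFacet {}
record LootFacet(String blockId) implements GenFacet {}           // 1 facet = 1 file
record TagFacet(String tag, String entry) implements GenFacet {}  // many facets = 1 file per tag

class GenFacetBag {
    final List<GenFacet> facets = new ArrayList<>();

    // Pour another holder's facets into this one.
    void addAll(GenFacetBag other) {
        facets.addAll(other.facets);
    }

    // Select every facet of one type, in insertion order.
    <T extends GenFacet> List<T> ofType(Class<T> type) {
        return facets.stream().filter(type::isInstance).map(type::cast).toList();
    }
}

class GenWriter {
    // Returns path -> file contents; the JSON here is simplified on purpose.
    static Map<String, String> render(GenFacetBag all) {
        Map<String, String> files = new LinkedHashMap<>();

        // One output file per loot facet.
        for (LootFacet loot : all.ofType(LootFacet.class)) {
            files.put("loot/" + path(loot.blockId()) + ".json",
                    "{\"drops\":\"" + loot.blockId() + "\"}");
        }

        // Tag facets are collected into one array per tag, then written once.
        Map<String, List<String>> byTag = new LinkedHashMap<>();
        for (TagFacet tag : all.ofType(TagFacet.class)) {
            byTag.computeIfAbsent(tag.tag(), k -> new ArrayList<>())
                    .add("\"" + tag.entry() + "\"");
        }
        byTag.forEach((tag, entries) -> files.put("tags/" + tag + ".json",
                "{\"values\":[" + String.join(",", entries) + "]}"));
        return files;
    }

    // "mymod:slope" -> "slope"
    static String path(String id) {
        return id.substring(id.indexOf(':') + 1);
    }
}

class GenWriterDemo {
    public static void main(String[] args) {
        GenFacetBag slope = new GenFacetBag();
        slope.facets.add(new LootFacet("mymod:slope"));
        slope.facets.add(new TagFacet("mineable_axe", "mymod:slope"));

        GenFacetBag all = new GenFacetBag();
        all.addAll(slope);
        all.facets.add(new TagFacet("mineable_axe", "mymod:cube")); // a loose facet

        GenWriter.render(all).forEach((p, json) -> System.out.println(p + " " + json));
    }
}
```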

Facet types themselves

Further work

I am very pleased with the locality.

Templates really only adds one kind of thing twenty times, which is probably why I can get away with this. I’ll have to wait and see how it works in a larger mod.

I want the source of truth to live entirely within the datagen system – i.e., I want datagen to drive the actual block registration code too. This means double-brace init will have to go, because I initialize a bunch of recipe shit that I have no business calling at runtime. It’s just a matter of splitting the double brace init into gentime and runtime methods and calling the appropriate one.

Instead of manually adding all FacetHolders into a list, I could scan my own classpath and look for classes with a certain annotation or a specific naming pattern (e.g. looking for classes named $Gen means I can nest gen code for a block inside the block’s class). I could even datagen a list of gen classes and load that at runtime instead of scanning the classpath ;)

Sketch:

class MyBlock {
  public static final Id ID = MyMod.id("myblock");

  //puts the runtime representation
  //of MyBlock here as soon as it's available
  @Inject
  public static MyBlock inst;

  // block code ...

  static class Gen extends BlockFacetHolder {
    //the zero-arg constructor would grab
    //the block ID automatically from a field
    //called "ID". you could override it

    @Override void data() {
      dropsSelf();
      shapeless().add("minecraft:stick");
    }

    @Override void runtime() {
      registerBlock(...);
      blockEntity(MyBlockEntity.ID);
    }
  }
}

And remember that you can do whatever, wherever. If you wanted to make a sixteen-color block, you could just put loops inside these methods and ignore or reassign the block/item ID fields.
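The “zero-arg constructor grabs the ID from a field called ID” part of the sketch can be done with plain reflection. Here is one way it might work, with assumptions spelled out: EnclosedIdHolder, ToyLamp and ToyLampVariants are hypothetical names, IDs are strings rather than an Id type, and the lookup only handles a Gen class nested directly inside the content class.

```java
import java.lang.reflect.Field;

// Hypothetical base class for this sketch; not the BlockFacetHolder from the sketch above.
abstract class EnclosedIdHolder {
    final String id;

    // Zero-arg path: read the static "ID" field of the enclosing class.
    protected EnclosedIdHolder() {
        this.id = readEnclosingId(getClass());
    }

    // Override path: the subclass supplies whatever ID it wants.
    protected EnclosedIdHolder(String id) {
        this.id = id;
    }

    private static String readEnclosingId(Class<?> genClass) {
        Class<?> outer = genClass.getEnclosingClass();
        if (outer == null) {
            throw new IllegalStateException(genClass.getName() + " is not nested in a content class");
        }
        try {
            Field field = outer.getDeclaredField("ID");
            field.setAccessible(true);
            return (String) field.get(null); // null receiver: the field is expected to be static
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no usable ID field on " + outer.getName(), e);
        }
    }
}

class ToyLamp {
    static final String ID = "mymod:lamp";

    // Picks up "mymod:lamp" automatically.
    static class Gen extends EnclosedIdHolder {}
}

class ToyLampVariants {
    static final String ID = "mymod:lamp";

    // Ignores the enclosing ID and uses its own.
    static class Gen extends EnclosedIdHolder {
        Gen() {
            super("mymod:red_lamp");
        }
    }
}

class EnclosedIdDemo {
    public static void main(String[] args) {
        System.out.println(new ToyLamp.Gen().id);
        System.out.println(new ToyLampVariants.Gen().id);
    }
}
```

The explicit-ID constructor is the escape hatch: the convention saves typing in the usual case without locking anything down.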

Lessons

Holder-based registration

Addendum.

There’s a vanilla class called Holder<T> which pairs together a registry, an ID, and possibly an object of type T corresponding to that ID. Unbound holders do not contain a T and crash when trying to retrieve it. Bound holders do contain such a T.

In NeoForge, many modders use the DeferredRegister<T> utility. You pass it a block ID and a block constructor, and in one step it creates an unbound Holder for the block, creates a task to construct/register the block at the appropriate time, and binds the holder to the block immediately after registering it. These holders are then stuck into an easily accessible class where it’s easy to grab the block ID throughout the project. (Older versions of Forge which predated Holder had the same idea.)

The problem is that when you use a Gen system, the block constructor call is deep inside your Gen code so it can’t be colocated with the holder pile. Instead, I propose a holder-first approach to doing your registration: instead of using DeferredRegister to create holders, just create unbound holders, and when you construct and register content, immediately bind them to all relevant holders.

Because an unbound holder doesn’t require a block constructor (it doesn’t really know what it’s registering yet), you regain the ability to stick them in convenient places.

Go a step further and make your register method take (Holder, T) instead of (Registry, ResourceLocation, T). Registration then becomes a problem about “binding holders” instead of a problem about “associating ids with things”. Small mindset shift.

In my project I actually created a clone of vanilla Holder called Latch, but only because I don’t trust Mojang to keep it around. I could experiment with the vanilla class.
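To make the holder-first idea concrete, here is a toy version. It is an illustration only: SlotHolder is neither vanilla Holder<T> nor Latch, ToyRegistry and ToyBlock are stand-ins for the real registry and block classes, and IDs are plain strings. The register method takes (holder, thing) and binds the holder as part of registering.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Objects;

// Hypothetical holder: an ID plus a value that arrives later.
final class SlotHolder<T> {
    final String id;
    private T value; // null while unbound

    SlotHolder(String id) {
        this.id = id;
    }

    boolean isBound() {
        return value != null;
    }

    // Unbound holders fail loudly instead of handing out null.
    T get() {
        if (value == null) throw new NoSuchElementException("unbound holder: " + id);
        return value;
    }

    void bind(T newValue) {
        if (value != null) throw new IllegalStateException("already bound: " + id);
        value = Objects.requireNonNull(newValue);
    }
}

// Toy registry whose register method takes (holder, thing) rather than (id, thing).
final class ToyRegistry<T> {
    private final Map<String, T> entries = new LinkedHashMap<>();

    T register(SlotHolder<T> holder, T thing) {
        if (holder.isBound() || entries.containsKey(holder.id)) {
            throw new IllegalStateException("duplicate registration: " + holder.id);
        }
        entries.put(holder.id, thing);
        holder.bind(thing); // binding happens in the same step as registering
        return thing;
    }

    T lookup(String id) {
        return entries.get(id);
    }
}

final class ToyBlock {
    final String material;

    ToyBlock(String material) {
        this.material = material;
    }
}

// Holders need no constructor call, so they can live anywhere convenient.
final class ToyHolders {
    static final SlotHolder<ToyBlock> MY_BLOCK = new SlotHolder<>("mymod:myblock");
}

class HolderFirstDemo {
    public static void main(String[] args) {
        ToyRegistry<ToyBlock> blocks = new ToyRegistry<>();
        System.out.println(ToyHolders.MY_BLOCK.isBound());
        // Deep inside gen/runtime code, construct the block and bind its holder.
        blocks.register(ToyHolders.MY_BLOCK, new ToyBlock("wood"));
        System.out.println(ToyHolders.MY_BLOCK.get().material);
    }
}
```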