New Fun Blog – Scott Bilas

Take what you want, and leave the rest (just like your salad bar).

Adding Attributes to Enums: Background


I’m finally getting back to the enum series.

What I mean by “Attributes”

I’m using “attribute” in the modern language sense – metadata that you can attach to a symbol. C# got it dead right with their implementation. In C#, attributes can be attached to types, function parameters, constants, methods, just about anything. Then you can query them back fairly cheaply to do operations based on them from within the game or a tool. The XML serialization system uses this pretty heavily to let you customize how a class gets serialized to/from XML. Attributes are used all over the .NET runtime to do interesting and useful things. Wouldn’t it be nice to get the same from C++?

C++ doesn’t support this directly in the language, but we can fake it with macros, templates, compile-time tools, or some combination. Any game that has any form of run-time typing information probably uses something like a DECLARE_RUNTIME_CLASS( classname ) macro to build tables about class inheritance and members. We could do something similar to declare file-scope statics that tack metadata onto whatever we like.
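
To make the file-scope-static idea concrete, here is a minimal sketch. Every name in it (ATTACH_METADATA, MetaEntry, MetaHead, FindMeta, and the Monster class) is made up for illustration and is not taken from any real engine. It assumes C++11 for nullptr.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical metadata record. Each instance links itself into a
// global singly linked list when it is constructed.
struct MetaEntry
{
    const char* symbol;       // name of the thing being described
    const char* description;  // the metadata attached to it
    MetaEntry*  next;

    MetaEntry(const char* sym, const char* desc);
};

// Head of the list. Using a function-local static sidesteps
// static-initialization-order problems between translation units.
MetaEntry*& MetaHead()
{
    static MetaEntry* head = nullptr;
    return head;
}

MetaEntry::MetaEntry(const char* sym, const char* desc)
    : symbol(sym), description(desc), next(MetaHead())
{
    MetaHead() = this;
}

// Declares a file-scope static whose constructor registers the
// metadata before main() runs.
#define ATTACH_METADATA(symbol, desc) \
    static MetaEntry s_meta_##symbol(#symbol, desc)

// Linear lookup by name. Good enough for a sketch; a real system
// would probably hash or sort the entries.
const MetaEntry* FindMeta(const char* symbol)
{
    assert(symbol != nullptr);
    for (const MetaEntry* e = MetaHead(); e != nullptr; e = e->next)
    {
        if (std::strcmp(e->symbol, symbol) == 0)
            return e;
    }
    return nullptr;
}

// Hypothetical usage: the metadata sits right next to the declaration.
class Monster {};
ATTACH_METADATA(Monster, "Base class for hostile actors");
```

With this in place, FindMeta("Monster") returns the registered entry and FindMeta on an unregistered name returns a null pointer. Note that this only works at file scope, which is exactly why it does not help inside an enum body.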

For this series, I’m limiting my scope to just enum constants. Why? Because they’re a pain to attribute. When you’re inside an enum declaring constants, the only things you can put in there are constants. So if you have a constant and you want to give it a string equivalent or attach other metadata to it, you must wait until after the end of the enum definition to do that. The more constants in the enum, the further away your metadata gets from what it describes, and the easier it is to get out of sync or introduce subtle errors. Not to mention the loss of readability.
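
Here is a small sketch of the problem, using a hypothetical EFruit enum invented for this example. The string table has to live after the enum, and the only cheap safety net (a C++11 static_assert on the entry count) catches a missing entry but not entries in the wrong order.

```cpp
#include <cassert>

// Hypothetical enum, invented for this illustration.
enum EFruit
{
    FRUIT_APPLE,
    FRUIT_BANANA,
    FRUIT_CHERRY,

    FRUIT_COUNT  // sentinel: number of real constants above
};

// ...in real code there may be dozens of lines between the enum
// and its metadata...

// Parallel table: entry i must describe the constant with value i.
static const char* const s_fruitNames[] =
{
    "FRUIT_APPLE",
    "FRUIT_BANANA",
    "FRUIT_CHERRY",
};

// Catches a missing or extra entry. It cannot catch two entries that
// were swapped, or a constant inserted mid-enum whose name was
// appended to the end of the table.
static_assert(sizeof(s_fruitNames) / sizeof(s_fruitNames[0]) == FRUIT_COUNT,
              "s_fruitNames is out of sync with EFruit");

const char* ToString(EFruit fruit)
{
    const int index = fruit;
    assert(index >= 0 && index < FRUIT_COUNT);
    return s_fruitNames[index];
}
```

With three constants this is easy to eyeball. With sixty, the name of a constant and its table entry are screens apart, which is the readability and sync problem described above.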

Quick detour

Before I go into this, I need to switch examples. The EGame enum from my previous posts is too contrived. Let’s choose a much more practical example, one from my past: bytecodes for the virtual machine of a roll-your-own scripting language. I’ve written several of these, but I definitely don’t recommend writing a scripting language today. Just use or adapt Lua.

Let’s use this simplified enum:

enum Op
{
    OP_ADD,
    OP_SUBTRACT,
    OP_MULTIPLY,
    OP_DIVIDE,
    OP_ISEQUAL,
    OP_ISGREATER,
    OP_LOAD,
    OP_STORE,
    OP_POP,
    OP_PUSH,
    OP_SWAP,
};

Pretty straightforward. Some operations to do a little math, comparisons, and variable/stack manipulation. A real Op enum would have many more opcodes. The Skrit language I wrote for Dungeon Siege had perhaps 60 and Sheep from Gabriel Knight 3 had around 40. Lua 5.1 has 38. The size matters. If we only had 5-10 items in our enum, it probably wouldn’t be worth doing all this mess I’m about to talk about.

Attributes we need

For each of these opcodes, we’re interested in knowing several things:

  1. The binary representation
    • The Op enum constant is usually the opcode itself. Getting the binary form just means casting it to the native instruction size of your VM, most often a byte.
    • With more complicated op schemes that have variable-length instructions, the most famous example being x86 assembly, there may be some processing required just to figure out the size of the op. This makes compiler and VM construction a lot more complicated. It also may stand in the way of a simple set of lookup tables to attach attributes to opcodes. If the possible set of opcodes is huge due to the number of combinations of prefixes and micro-ops, then the table-based approach I’m going for in this series is probably not going to work. An algorithmic or perhaps a combined algorithm/table approach would be required.
  2. A string representation of the opcode
    • This is important for printing out error messages or for dumping a disassembly of the compiled source. So OP_ADD becomes “OP_ADD”. And in our VM’s error handler we can do a mixed-mode disassembly with the aid of a ToString():

          printf("%04X %02X %s\n", offset, op, ToString(op));

    • We may also want a more human-readable description if we’re generating low-level help automatically. So OP_ADD would need “Pop top two stack items and push the sum”. Lua, for example, just embeds this as a comment. Here’s their OP_ADD, as well as a more complicated op, copy-pasted from lopcodes.h:

          OP_ADD,/*    A B C    R(A) := RK(B) + RK(C)                */
          OP_FORLOOP,/*    A sBx    R(A)+=R(A+2);
                      if R(A) <?= R(A+1) then { pc+=sBx; R(A+3)=R(A) }*/

  3. The size of the trailing data
    • Lots of things are easier if you can quickly ask how big the entire op plus its data is: disassembly, patching the bytecode at runtime to set breakpoints, and so on. OP_ADD will require different data from OP_RETURN, for example.
  4. Flags regarding each op’s behavior
    • Perhaps your VM is statically typed and the type info is lost at compile time. You may require different OP_ADD variants for integer vs. float vs. string data – Skrit had OP_ADDI, OP_ADDF, and OP_ADDS. If this is the case, it would be convenient to attach flags to your ops to say what type of data the op works on.
    • As a practical example, in Skrit, when I was compiling a two-data operation like + or || I would call a general emit function that could determine if conversions were required by either side based on traits attached to the op. If TRAIT_LOGIC was set I’d only permit ints and bools as the data. For TRAIT_MATH opcodes I’d check both sides to see if they were compatible and emit promotions for one or the other if not. It was convenient to be able to tell if a proposed op was string-compatible by checking TRAIT_STRING.
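
To show what querying these attributes could look like, here is a rough table-based sketch. It is not the implementation from this series, just one plain way to get a name, a size, and flags from an Op. The OP_COUNT sentinel, the OpInfo struct, the operand sizes, and the TRAIT_COMPARE and TRAIT_STACK flags are all assumptions made for this example; only the Op constants and TRAIT_MATH come from the text. It assumes C++11 and one-byte opcodes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>

enum Op
{
    OP_ADD, OP_SUBTRACT, OP_MULTIPLY, OP_DIVIDE,
    OP_ISEQUAL, OP_ISGREATER,
    OP_LOAD, OP_STORE, OP_POP, OP_PUSH, OP_SWAP,

    OP_COUNT  // sentinel added for this sketch
};

enum Trait
{
    TRAIT_NONE    = 0,
    TRAIT_MATH    = 1 << 0,
    TRAIT_COMPARE = 1 << 1,  // invented for this sketch
    TRAIT_STACK   = 1 << 2,  // invented for this sketch
};

struct OpInfo
{
    const char*  name;          // string form of the constant
    std::uint8_t operandBytes;  // trailing data size (assumed values)
    unsigned     traits;        // bitwise OR of Trait flags
};

// One row per opcode, in enum order.
static const OpInfo s_opInfo[] =
{
    { "OP_ADD",       0, TRAIT_MATH    },
    { "OP_SUBTRACT",  0, TRAIT_MATH    },
    { "OP_MULTIPLY",  0, TRAIT_MATH    },
    { "OP_DIVIDE",    0, TRAIT_MATH    },
    { "OP_ISEQUAL",   0, TRAIT_COMPARE },
    { "OP_ISGREATER", 0, TRAIT_COMPARE },
    { "OP_LOAD",      1, TRAIT_NONE    },  // assumed: 1-byte variable index
    { "OP_STORE",     1, TRAIT_NONE    },  // assumed: 1-byte variable index
    { "OP_POP",       0, TRAIT_STACK   },
    { "OP_PUSH",      4, TRAIT_STACK   },  // assumed: 4-byte immediate
    { "OP_SWAP",      0, TRAIT_STACK   },
};
static_assert(sizeof(s_opInfo) / sizeof(s_opInfo[0]) == OP_COUNT,
              "s_opInfo is out of sync with Op");

const OpInfo& GetInfo(Op op)
{
    assert(static_cast<unsigned>(op) < static_cast<unsigned>(OP_COUNT));
    return s_opInfo[op];
}

const char* ToString(Op op)         { return GetInfo(op).name; }
std::size_t SizeOf(Op op)           { return 1 + GetInfo(op).operandBytes; }
bool        HasTrait(Op op, Trait t) { return (GetInfo(op).traits & t) != 0; }

// Prints one line per instruction and returns how many were decoded.
// Stops at an unknown opcode byte or a truncated instruction.
int Disassemble(const std::uint8_t* code, std::size_t size)
{
    int count = 0;
    std::size_t offset = 0;
    while (offset < size && code[offset] < OP_COUNT)
    {
        const Op op = static_cast<Op>(code[offset]);
        if (offset + SizeOf(op) > size)
            break;
        std::printf("%04X %02X %s\n", static_cast<unsigned>(offset),
                    static_cast<unsigned>(op), ToString(op));
        offset += SizeOf(op);
        ++count;
    }
    return count;
}
```

Each query is a single array index, so it is fast, but the table still sits apart from the enum and depends on matching order. That is the weakness an attribute scheme has to deal with.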

So for a given enum constant, we want to be able to gather each of these attributes, and quickly. Next article I’ll go into the implementation details.

February 2nd, 2008 at 12:28 pm

Posted in c++, enum
