Introduction Wilfred Springer
Table of Contents

1. Introduction
2. Declarative Binding
3. Convention over configuration
4. More than just numbers
5. Composite content
6. Inheritance
7. Lists
8. Lazy loading lists
9. Expressions
10. Limbo
11. Conditionals
12. Complex references
13. Documentation
1. Introduction

Decoding a compressed data format in Java can be quite a mundane task. This chapter will hopefully convince you that it does not need to be all that complicated. Preon aims to provide a simple and powerful solution.

This chapter will introduce the most important principles behind Preon. Don't expect this chapter to be an exhaustive reference guide for everything Preon has to offer. Completeness is not considered to be achieved if there is nothing else to add; it is considered to be achieved if there is nothing left to be taken out. The sole objective of this chapter is to prevent surprises once you start to use Preon.
2. Declarative Binding

Preon is not just a library allowing you to access bits in a bitstream, or a library that has a number of convenience mechanisms allowing you to read data from a compressed, bitstream-encoded format. It is much more than that, and - just like IoC - it obeys the Hollywood principle: "don't call us, we'll call you."

If Preon had just been a library of functions facilitating different compression mechanisms, then you would have been required to load the decoded data into your own data structures yourself.
But Preon turns that around: you just provide it the blueprints of the data structure, and Preon will make sure the data gets loaded into the appropriate places.

Now, the big question is of course: how does Preon know how to load data from a sequence of bits and reconstruct a data structure? What is the recipe? It turns out, the recipe is the data structure itself; it's just classes and attributes, period. And in case the information provided by the data structure is not enough, we will just augment it using annotations.

So, let's look at an example. Example 1, “First data structure” defines a data structure holding the coordinates of the two points that define a rectangle, in some coordinate system.
Example 1. First data structure

    class Rectangle {
        private int x1;
        private int y1;
        private int x2;
        private int y2;
    }
Let's just say that the data would be encoded on disk as four consecutive 32-bit integers. In that case, this is the way you would decode a Rectangle from a file:
    byte[] buffer = new byte[] {
        0, 0, 0, 1,
        0, 0, 0, 2,
        0, 0, 0, 3,
        0, 0, 0, 4
    };
    Codec<Rectangle> codec = Codecs.create(Rectangle.class);
    Rectangle rect = Codecs.decode(codec, buffer);
That's how easy it is. Although... To be perfectly honest, the example is not entirely complete yet. If you used the code as-is, you would not get anything at all. (Well, an empty Rectangle, maybe.) The problem is that Preon does not assume anything about your decoding requirements. In order to tell Preon that it needs to decode the x1, y1, x2 and y2 fields, you will need to explicitly state it in the definition of the data structure, like this:
Example 2. First data structure annotated

    class Rectangle {
        @Bound private int x1;
        @Bound private int y1;
        @Bound private int x2;
        @Bound private int y2;
    }
If you run the same code now, you will get a populated Rectangle.
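For comparison, the sketch below shows roughly what you would otherwise write by hand to get the same result for this buffer. It is not Preon code and makes no claim about how Preon works internally; the class name ManualRectangleDecoder and its nested Rectangle holder are made up for this illustration, and it relies only on java.nio.ByteBuffer from the standard library.

```java
import java.nio.ByteBuffer;

// Hand-written counterpart of the four @Bound fields. NOT Preon code:
// it only illustrates the work that the declarative binding takes over.
final class ManualRectangleDecoder {

    /** Hypothetical holder mirroring the fields of the Rectangle example. */
    static final class Rectangle {
        int x1;
        int y1;
        int x2;
        int y2;
    }

    /** Reads four consecutive 32-bit big endian integers from the buffer. */
    static Rectangle decode(byte[] buffer) {
        // A freshly wrapped ByteBuffer reads in big endian byte order.
        ByteBuffer in = ByteBuffer.wrap(buffer);
        Rectangle rect = new Rectangle();
        rect.x1 = in.getInt();
        rect.y1 = in.getInt();
        rect.x2 = in.getInt();
        rect.y2 = in.getInt();
        return rect;
    }

    public static void main(String[] args) {
        byte[] buffer = { 0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 3, 0, 0, 0, 4 };
        Rectangle rect = decode(buffer);
        System.out.println(rect.x1 + ", " + rect.y1 + ", " + rect.x2 + ", " + rect.y2);
    }
}
```

Every field that is added to the format means another line in decode, in exactly the right position. With the annotated class, the field declaration is the only place that needs to change.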
3. Convention over configuration

So, binding a Java data structure to its encoded representation seems to be fairly easy. However, Preon intends to support bitstream encoded data. That is, in many cases you don't want to allocate the full 32 bits for decoding an integer, and you would rather rely on just a couple of bits instead. Or what if you don't want big endian byte order, but little endian byte order instead?

All of this turns out to be additional configuration only. If you just specify the @Bound annotation, you basically tell Preon to use its default mapping to decode an integer. And by default, it will assume big endian byte order, and decode an integer using 32 bits. If you want to change that, you just need to tell Preon to do so, using other annotations, or (sometimes) other annotation attributes.
Example 3. Encoding integers using two bytes

    class Rectangle {
        @BoundNumber(size="16") private int x1;
        @BoundNumber(size="16") private int y1;
        @BoundNumber(size="16") private int x2;
        @BoundNumber(size="16") private int y2;
    }
Example 3, “Encoding integers using two bytes” changes the example given before to make Preon allocate only two bytes (16 bits) per number. Example 4, “Encoding integers using little endian byte order” illustrates how to change byte order.
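To make the effect of these two settings concrete at the byte level, here is a small standard-library sketch that reads the same bytes with different sizes and byte orders. It does not use Preon. The class and method names are invented for the illustration, and treating the 16-bit value as unsigned is an assumption of this sketch, not a statement about how Preon handles signedness.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Standard-library illustration of "size" and "byte order". NOT Preon code.
final class SizeAndByteOrderSketch {

    /**
     * Reads the first two bytes as one 16-bit number in the given byte order.
     * Assumption of this sketch: the value is interpreted as unsigned.
     */
    static int readUnsigned16(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getShort() & 0xFFFF;
    }

    /** Reads the first four bytes as one 32-bit number in the given byte order. */
    static int read32(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt();
    }

    public static void main(String[] args) {
        byte[] two = { 0x01, 0x02 };
        System.out.println(readUnsigned16(two, ByteOrder.BIG_ENDIAN));    // 0x0102
        System.out.println(readUnsigned16(two, ByteOrder.LITTLE_ENDIAN)); // 0x0201

        byte[] four = { 1, 0, 0, 0 };
        System.out.println(read32(four, ByteOrder.BIG_ENDIAN));
        System.out.println(read32(four, ByteOrder.LITTLE_ENDIAN));
    }
}
```

The bytes on disk are identical in each pair of calls; only the interpretation differs, which is why the text describes these choices as configuration.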
Example 4. Encoding integers using little endian byte order

    class Rectangle {
        @BoundNumber(byteOrder=LittleEndian) private int x1;
        @BoundNumber(byteOrder=LittleEndian) private int y1;
        @BoundNumber(byteOrder=LittleEndian) private int x2;
        @BoundNumber(byteOrder=LittleEndian) private int y2;
    }

4. More than just numbers

Preon is not just about decoding numbers. By default, it already supports much more than that. Here are some examples:
    // Will just read one bit, interpreting 1 as true, and 0 as false
    @Bound boolean visible;

    // Reads a String from a fixed number of bytes
    @BoundString(size="10") String value;

    // Reads a bit from the buffer, and interprets it as an enum value,
    // interpreting the number as its ordinal value.
    @BoundNumber(size="1") Type type;
The first example is a good example of convention over configuration; a @Bound annotation on a boolean attribute will cause Preon to read one bit, and interpret it as true in case its value is 1, and false otherwise.

The String example illustrates one way of decoding Strings. Note that (in this case) the size attribute denotes the number of bytes and not the number of bits. In this case, the bytes read will be interpreted as US-ASCII.

The third example illustrates how a type-safe enum value is bound to the bit buffer. Again, it relies on an annotation that we have already seen before: the @BoundNumber annotation specifies how a certain numeric value needs to be read. The numeric value read needs to correspond to the ordinal value of one of the enum's values.
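The three interpretations described above can be sketched by hand with a tiny bit reader. This is an illustration only, not Preon's implementation: the BitReader class, the Type enum constants and the most-significant-bit-first bit order are all assumptions made for this sketch.

```java
import java.nio.charset.StandardCharsets;

// Hand-rolled illustration of bit-level reads. NOT Preon code.
final class BitFieldSketch {

    /** Hypothetical enum; its constants are invented for this example. */
    enum Type { CIRCLE, RECTANGLE }

    /** Minimal bit reader. Assumption: bits are consumed most significant bit first. */
    static final class BitReader {
        private final byte[] data;
        private int bitPos; // absolute read position, counted in bits

        BitReader(byte[] data) {
            this.data = data;
        }

        /** Reads nrBits (1 to 31) bits and returns them as a non-negative int. */
        int readBits(int nrBits) {
            int value = 0;
            for (int i = 0; i < nrBits; i++) {
                int currentByte = data[bitPos / 8] & 0xFF;
                int bit = (currentByte >> (7 - bitPos % 8)) & 1;
                value = (value << 1) | bit;
                bitPos++;
            }
            return value;
        }

        /** One bit: 1 means true, 0 means false. */
        boolean readBoolean() {
            return readBits(1) == 1;
        }

        /** Reads a number and uses it as the ordinal of the enum constant. */
        <E extends Enum<E>> E readEnum(Class<E> type, int nrBits) {
            return type.getEnumConstants()[readBits(nrBits)];
        }

        /** Reads nrBytes bytes (not bits) and decodes them as US-ASCII. */
        String readAscii(int nrBytes) {
            byte[] raw = new byte[nrBytes];
            for (int i = 0; i < nrBytes; i++) {
                raw[i] = (byte) readBits(8);
            }
            return new String(raw, StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) {
        // Bit layout: 1 (boolean), 1 (enum ordinal), then "Hi" as 16 bits, zero padded.
        BitReader in = new BitReader(new byte[] { (byte) 0xD2, 0x1A, 0x40 });
        System.out.println(in.readBoolean());
        System.out.println(in.readEnum(Type.class, 1));
        System.out.println(in.readAscii(2));
    }
}
```

Note that the string in this sketch does not start on a byte boundary, which is exactly the kind of bookkeeping that becomes tedious in hand-written decoders.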
5. Composite content

The examples that we have seen so far were all pretty simple: in all cases, it was just about a sequence of attributes. But what if - in your format - you have some higher-level conceptual organization of some elements? How would you deal with that?

Let's take our Rectangle from Example 2, “First data structure annotated” as an example. What if we want it to have a fill and border color as well? Would this be the solution?

    @Bound int fillRed;
    @Bound int fillGreen;
    @Bound int fillBlue;
    @Bound int borderRed;
    @Bound int borderGreen;
    @Bound int borderBlue;

... or would you rather have this?

    @Bound RgbColor fillColor;
    @Bound RgbColor borderColor;

I hope you agree with me that the second option would be the preferred one. And it turns out this is possible. If you define your RgbColor class as in Example 5, “RGB Color”, then you can use the RgbColor type for attributes in other classes.

Example 5. RGB Color

    class RgbColor {
        @Bound int red;
        @Bound int green;
        @Bound int blue;
    }

It's important to emphasize that - from a processing point of view - there is no difference between both options. Both options will result in six numeric values being read consecutively. However, the second allows you to stick with the conceptual organization of your file, results in less code and makes it easier to maintain.

Just to complete the example, Example 6, “Colored Rectangle” gives the source code of the Rectangle supporting colors.
Example 6. Colored Rectangle

    class Rectangle {
        @Bound private RgbColor fillColor;
        @Bound private RgbColor borderColor;
        @Bound private int x1;
        @Bound private int y1;
        @Bound private int x2;
        @Bound private int y2;
    }
6. Inheritance

In the previous section (Section 5, “Composite content”), we saw one way of factoring out commonalities. Preon allows you to factor out recurring snippets of content and use those definitions in various places. However, it is not the only way to organize content. In this section, I will show you another way.

Example 6, “Colored Rectangle” defines just one shape. And of course, that shape also defines some color attributes. But what if your application defines more shapes? And what if you want all of those shapes to define the same color attributes?

In order to facilitate scenarios like these, Preon supports inheritance. That is, subclasses will always inherit the bindings defined by the superclass. This allows you to define color bindings on a Shape superclass, and have all of the other shape-specific attributes on the various Shape subclasses.
Example 7. Shape Inheritance

    class Shape {
        @Bound RgbColor fillColor;
        @Bound RgbColor lineColor;
    }

    class Rectangle extends Shape {
        @Bound private int x1;
        @Bound private int y1;
        @Bound private int x2;
        @Bound private int y2;
    }

    class Circle extends Shape {
        @Bound int centerX;
        @Bound int centerY;
        @Bound int radius;
    }
When Preon is required to decode a subclass, it will always first process the inherited bindings. In case of a Rectangle, it will first decode the colors it inherited from Shape, and only then the attributes of the rectangle itself.
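The "inherited bindings first" ordering can be illustrated with plain reflection. The sketch below is not how Preon is implemented; it uses a made-up @Marked annotation as a stand-in for a binding annotation and simply walks the class hierarchy from the top down. Colors are reduced to int fields to keep the example short.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Reflection-based illustration of "superclass bindings first". NOT Preon code.
final class InheritedBindingOrderSketch {

    /** Hypothetical marker annotation; a stand-in, not Preon's @Bound. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Marked {}

    static class Shape {
        @Marked int fillColor;
        @Marked int lineColor;
    }

    static class Circle extends Shape {
        @Marked int centerX;
        @Marked int centerY;
        @Marked int radius;
    }

    /** Returns the names of all marked fields, superclass fields before subclass fields. */
    static List<String> markedFieldNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        collect(type, names);
        return names;
    }

    private static void collect(Class<?> type, List<String> names) {
        if (type == null || type == Object.class) {
            return;
        }
        // Recurse first, so that inherited fields end up at the front of the list.
        collect(type.getSuperclass(), names);
        // Note: the order of fields within one class is not guaranteed by the Java API.
        for (Field field : type.getDeclaredFields()) {
            if (field.isAnnotationPresent(Marked.class)) {
                names.add(field.getName());
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(markedFieldNames(Circle.class));
    }
}
```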
7. Lists

The 'struct' type of support highlighted in the previous sections is an important concept, but not sufficient. We also need a 'sequence' concept.

In Preon, the @BoundList annotation allows you to address all of these 'sequencing' concerns. And the annotation-driven approach makes it fairly easy to use. Let's just say you want to store a pseudo-Mondriaan type of painting, consisting of colored rectangles only. In that case, you could use code like this:

    class PseudoMondriaan {
        @BoundList(type=Rectangle.class, size="20")
        Rectangle[] rectangles;
    }

The code given above simply states that - whenever you want to decode a PseudoMondriaan - Preon should read 20 Rectangle instances according to the Rectangle specification, and store them as the attribute 'rectangles'.

Now, it is of course questionable if it is realistic to expect that every PseudoMondriaan contains exactly 20 rectangles. However, the @BoundList's size attribute also takes expressions that are evaluated at runtime. More on that in Section 9, “Expressions”.
8. Lazy loading lists

Arrays are not the only type of lists supported by default; Preon also supports Lists (the java.util.List kind), as you might have expected. There is however a difference in the way it treats Lists.

Whenever Preon decodes a List of items from the bit buffer, there is a chance it will inject a lazy loading List into the List type attribute instead of a fully populated, eagerly loaded List instance. I say there is 'a chance', since it really depends on a number of conditions whether it will do so. The actual conditions are quite complex, but they can be summarized as "the principle of least surprise."

    class PseudoMondriaan {
        @BoundList(size="20", type=Rectangle.class)
        private List<Rectangle> rectangles;
    }

Preon has a preference for loading data lazily. In the above case, the size of each Rectangle instance is known in advance. In fact, we know that the number of bits occupied by the entire List will be 20 times the size of a single Rectangle. And we also know that - in order to get the third element - we just need to skip over the first two elements (the number of bits of a single Rectangle element, multiplied by two). Using a lazy loading List is in the above case probably a fairly efficient thing to do.

There are other cases in which this would be much harder, typically when the List item's size is determined at runtime. In cases like these, Preon will automatically pick an alternative Codec, most likely one that eagerly loads the data.
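The reasoning about fixed-size elements can be made concrete with a small sketch of a lazily decoding list. This is only an illustration of the idea, not Preon's Codec: the class names are invented, the element is assumed to be four 32-bit big endian integers (16 bytes), and decoding works on whole bytes rather than bits to keep the code short.

```java
import java.nio.ByteBuffer;
import java.util.AbstractList;
import java.util.List;

// Illustration of lazy element access for fixed-size items. NOT Preon code.
final class LazyListSketch {

    /** Assumption of this sketch: one Rectangle is four 32-bit integers. */
    static final int ELEMENT_SIZE_BYTES = 16;

    static final class Rectangle {
        final int x1;
        final int y1;
        final int x2;
        final int y2;

        Rectangle(int x1, int y1, int x2, int y2) {
            this.x1 = x1;
            this.y1 = y1;
            this.x2 = x2;
            this.y2 = y2;
        }
    }

    /** Returns a read-only List view that decodes an element only when it is requested. */
    static List<Rectangle> lazyRectangles(byte[] buffer, int count) {
        return new AbstractList<Rectangle>() {
            @Override
            public Rectangle get(int index) {
                if (index < 0 || index >= count) {
                    throw new IndexOutOfBoundsException("index: " + index);
                }
                // Skip over the preceding elements: index times the fixed element size.
                ByteBuffer in = ByteBuffer.wrap(buffer, index * ELEMENT_SIZE_BYTES, ELEMENT_SIZE_BYTES);
                return new Rectangle(in.getInt(), in.getInt(), in.getInt(), in.getInt());
            }

            @Override
            public int size() {
                return count;
            }
        };
    }

    public static void main(String[] args) {
        ByteBuffer out = ByteBuffer.allocate(3 * ELEMENT_SIZE_BYTES);
        for (int i = 0; i < 12; i++) {
            out.putInt(i);
        }
        List<Rectangle> rectangles = lazyRectangles(out.array(), 3);
        Rectangle third = rectangles.get(2); // only this element is decoded
        System.out.println(third.x1 + ", " + third.y1 + ", " + third.x2 + ", " + third.y2);
    }
}
```

The offset computation in get is only possible because every element has the same size. Once the size depends on the data itself, the position of element n is unknown without decoding elements 0 to n-1 first, which is why eager loading becomes the safer choice.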
9. Expressions

By now, you have probably wondered a couple of times why the size attribute on @BoundList and @BoundString accepts String values instead of numbers. It seems rather wasteful, and the compiler is not going to help you to make sure you enter numeric values in these situations.

The truth is, these attributes expect more than just numeric values: a lot of annotations in Preon allow you to pass in expressions instead of numeric values. The size attributes on @BoundList and @BoundString are just examples of cases in which these expressions are accepted.

    class PseudoMondriaan {
        @Bound int numberOfRectangles;
        @BoundList(size="numberOfRectangles", type=Rectangle.class)
        List<Rectangle> rectangles;
    }

The example above is still a fairly simple example. It refers to a variable called numberOfRectangles, and it's not hard to guess how this variable is getting resolved: by default, Preon will try to resolve variables into bound attributes defined before. (There are other types of references, but let's stick with this simple example for now.)

Now, you can obviously construct more complex expressions than this. And in order to do that, you can use a range of arithmetic or logical operators. Here are some examples:

    @BoundList(size="width * height") byte[] pixels;

    @BoundNumber(size="nrBits * 2") int value;

    // Clearly pointless, but you know...
    @BoundString(size="x * (y + z) / 23 ^ t") String text;
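What a size expression such as "width * height" amounts to can be shown with a hand-written decoder in which the length of a later field is computed from fields read earlier. This sketch does not use Preon; the class name, the method name and the layout (two 32-bit big endian integers followed by one byte per pixel) are assumptions made for the illustration.

```java
import java.nio.ByteBuffer;

// Illustration of a size that depends on previously decoded values. NOT Preon code.
final class SizeExpressionSketch {

    /**
     * Assumed layout: 32-bit width, 32-bit height, then width * height pixel bytes.
     * Returns just the pixel bytes.
     */
    static byte[] decodePixels(byte[] buffer) {
        ByteBuffer in = ByteBuffer.wrap(buffer);
        int width = in.getInt();
        int height = in.getInt();
        // The equivalent of size="width * height": evaluated at decoding time.
        byte[] pixels = new byte[width * height];
        in.get(pixels);
        return pixels;
    }

    public static void main(String[] args) {
        byte[] buffer = { 0, 0, 0, 2, 0, 0, 0, 3, 1, 2, 3, 4, 5, 6 };
        System.out.println(decodePixels(buffer).length);
    }
}
```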
10. Limbo

The expression language used in the examples of the previous section is Limbo. Limbo is a fairly small expression language, and is explicitly aiming at not being as feature-rich as JSP's EL or OGNL, and for good reason.

First of all, JSP's EL and OGNL allow you to call methods on objects. Limbo does not allow you to call methods, because it wants to capture the meaning of the expression explicitly. And it would be impossible to capture the meaning of your expression if it involved operations defined outside of Limbo itself. So, Limbo works on data only, and the number of operators is finite.

Limbo is not just about evaluating expressions. It also wants to capture those expressions in such a way that it will be able to generate human-readable descriptions from them afterwards. So, if these were the Limbo expressions:

    width * height
    a^2 + b^2 == c^2

... then Limbo wants to be capable of turning them into this:

• width of the image times the height of the image
• the sum of a to the power of 2 and b to the power of 2 equals c to the power of 2

In a couple of sections, we will see why this is relevant.
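The idea of an expression that can be both evaluated and described can be sketched with a very small expression tree. This is not Limbo and does not reflect its API or its actual wording rules; the Expr interface, the factory methods and the descriptions passed in are all invented for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy expression tree that can evaluate AND describe itself. NOT Limbo.
final class DescribableExpressionSketch {

    /** Hypothetical node type: every node knows its value and its description. */
    interface Expr {
        int eval(Map<String, Integer> variables);
        String describe();
    }

    /** A named variable with a human-readable description supplied by the caller. */
    static Expr variable(String name, String description) {
        return new Expr() {
            @Override
            public int eval(Map<String, Integer> variables) {
                // Throws NullPointerException if the variable is not defined.
                return variables.get(name);
            }

            @Override
            public String describe() {
                return description;
            }
        };
    }

    /** Multiplication of two sub-expressions. */
    static Expr multiply(Expr left, Expr right) {
        return new Expr() {
            @Override
            public int eval(Map<String, Integer> variables) {
                return left.eval(variables) * right.eval(variables);
            }

            @Override
            public String describe() {
                return left.describe() + " times " + right.describe();
            }
        };
    }

    public static void main(String[] args) {
        Expr size = multiply(
                variable("width", "the width of the image"),
                variable("height", "the height of the image"));
        Map<String, Integer> variables = new HashMap<>();
        variables.put("width", 4);
        variables.put("height", 3);
        System.out.println(size.eval(variables));
        System.out.println(size.describe());
    }
}
```

Because every operator is a known node type, a description can always be produced. An arbitrary method call inside the expression would break this, which matches the reason the text gives for not supporting method calls.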
11. Conditionals

The size attribute defined on the @BoundList annotation obviously acts as a type of control mechanism. However, in order to be able to deal with the majority of encoding formats, we need to introduce a couple of other control structures to support conditionals.

Now, the number of places in which you would be able to use a condition is again open-ended. Preon is extensible, so you can define your own annotations using conditions wherever you like. However, the framework supports some by default, and we are going to mention two here.

The first one is the @If annotation. By putting this annotation in front of an attribute, you basically state that - whatever attribute is following - it should only be decoded if the condition inside the @If holds.

Now, let's see what that means in our example. Let's just say that a Shape allows you to specify that you want a dashed border, and that it defines some additional attributes allowing you to control how the dashes are drawn.

    class Shape {
        @Bound Color fillColor;
        @Bound Color borderColor;
        @Bound boolean dashed;
        @If("dashed")
        @Bound int spaceBetweenDashes;
    }
12. Complex references

Section 9, “Expressions” already mentioned that Preon does not limit you to referring only to other bound attributes on the same class. It also supports the dot operator (to access attributes of referenced objects) and the item operator (to access items in list-type attributes). Example 8, “Valid references” lists some sample references.
Example 8. Valid references

    a.b.c
    a.b[0].d
    a.b[e].d
    a.b[e * 21].d
Another thing that Preon allows you to do is to refer to the outer context. Example 9, “Outer context references” relies on this to refer to the bitsPerColorComponent variable in the Image, from within the RgbColor object.
Example 9. Outer context references

    public class Image {
        @Bound int bitsPerColorComponent;
        @Bound int nrColors;
        @BoundList(size="nrColors") RgbColor[] colors;
        ...
        public class RgbColor {
            @BoundNumber(size="outer.bitsPerColorComponent") int red;
            @BoundNumber(size="outer.bitsPerColorComponent") int green;
            @BoundNumber(size="outer.bitsPerColorComponent") int blue;
        }
    }
13. Documentation

Various sections in this chapter already alluded to the fact that Preon aims at capturing dependencies between different pieces of data explicitly, in order to be able to generate documentation. If this is the first time you read about Preon, you might wonder what that was all about.

Preon was born out of unease with an existing situation. In that situation, one guy was responsible for the software for decoding a complicated file format, while another was responsible for encoding data; both of them were responsible for keeping the documentation in sync. In the end, everything turned out to be out of sync. Keeping the encoder in sync with the decoder was hard, and the documentation was always out of whack.

Preon aims to change all of that: it set out to provide a framework in which you specify the encoding format once, and get decoders, encoders and documentation for free.

So, how does it work? Well, if you remember the first example in this chapter, we constructed a Codec and used it through the Codecs class, like this:

    Codec<Rectangle> codec = Codecs.create(Rectangle.class);
    Rectangle rect = Codecs.decode(codec, buffer);

Generating documentation just requires using another operation on Codecs:

    Codec<Rectangle> codec = Codecs.create(Rectangle.class);
    Codecs.document(codec, DocumentType.Html, new File(...));

When you apply that to Example 7, “Shape Inheritance”, this is what you get:
Figure 1. Sample report