Tuesday, June 30, 2009

Weekly update for June 30th

The bug with coverage.py has been confirmed. The exact nature of the bug hasn't been determined, but figleaf appears to be more reliable and will be the source of coverage reports from here on.

The parser is nearly finished now. All that remains is to get the logic in line with the message protocol, which should not take more than a few days. Once this is complete, work on Client/Server/common will begin; for real this time.

Thursday, June 25, 2009

Daily Update for June 24th

Mostly coverage work today. Wrote several XML files for all the verification tests. I also suspect that I've run into a bug in coverage.py, because there are some strange holes in the reported coverage. Investigation will follow.

Tuesday, June 23, 2009

Weekly update for June 23rd

This week most of the work was on the DOM parser, along with a fix to the sorting method of ListProxy.

The parser now has more complete verification and several test files. Parameters are mostly implemented now, but are still lacking some of the semantic logic to tie them together to the packets. This should be remedied in the next few days, followed by work on Client/Server/common.

Tuesday, June 16, 2009

Weekly update for June 16th

This week I pulled 100% coverage of structures and 99.5% coverage of xstruct, as well as finishing up the black magic of structures (Group and List.)

I'm on to Parser, where I have unit tests for the parts that I can confirm work (pretty much everything except parameterlists and enumerations.) Being sick the last few days has robbed me of the mental capacity to add the missing functionality to Parser, so I bided my time by rewriting Parser in DOM rather than SAX. The DOM parser has enumeration support, but no parametersets yet. I am waiting on mithro's approval before I decide whether or not to replace the SAX version.

Sunday, June 14, 2009

Work for June 13th

Fixed up parser's imports so it runs. It has no support for parametersets, though. I also discovered that protocol.xml was buggy and grabbed a new version from the documents repository. Also found a couple instances of eval() to get members of modules. Replaced them with getattr.
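As a rough illustration of the eval() to getattr change (this is a sketch, not the parser's actual code; the helper name get_member and the use of importlib are my own assumptions):

```python
import importlib


def get_member(module_name, member_name):
    """Look up a module attribute by name without evaluating any code."""
    module = importlib.import_module(module_name)
    # getattr only performs an attribute lookup, whereas
    # eval("module." + member_name) would execute whatever string it is given.
    return getattr(module, member_name)


sqrt = get_member("math", "sqrt")
print(sqrt(16.0))
```

A bad name now fails with an AttributeError instead of running arbitrary text as Python.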

Thursday, June 11, 2009

Work for June 11th

I think ListProxy now supports every list operation (I know I'm going to be shown wrong soon). Structures now has 100% test coverage, and xstruct has just 1 uncovered line related to traceback formatting.

Tomorrow I should be starting on parser.

Wednesday, June 10, 2009

Work for June 10th

Most of today's work consisted of looking at segments of xstruct and structures that the unit tests haven't touched and finding a way to get them covered.

xstruct is now down to six uncovered lines, and structures down to four, for an aggregate 98% coverage. Said lines are either very difficult/impossible to hit or of questionable utility, so I will seek mithro's wisdom before proceeding.

Tuesday, June 9, 2009

How structures works

structures contains much more deep magic than I was expecting when I started on it. For the benefit of anyone who might be working on it later, or the curious, I will try to explain.

Basics

The structure classes all implement type-safe wrappers for data types. What this means for you is that if your object has a StringStructure member, then you can pretty much treat it just like a string. You can plug it in wherever you would need a string and you can set it to any string value. The difference is that if you try to set it to a non-string value, an exception will be thrown. In libtpproto2-py this is used to catch wrong values for types as soon as possible rather than when we attempt to send them over the protocol.

How It Works

To achieve these properties we essentially have to change how Python acts when a Structure is read or written on an instance of a class. Fortunately for us, Python provides tools for doing just this. The Structure classes all override the __get__ and __set__ methods. These methods are called when a member of a class is read from or written to, respectively. Each Structure stores a special variable in the owning object that contains the value it is storing. So, if you have a StringStructure named "string" in an object A and write to it, it first checks that the new value is a legal string. If this value is legal, it stores it in A.__string. Likewise, when A.string is accessed, it actually returns A.__string.
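Here is a minimal sketch of that mechanism, using Python's descriptor protocol. It is not the real structures code: the constructor argument and the error message are assumptions, and only the "__" + name storage convention comes from the description above.

```python
class StringStructure(object):
    """Descriptor sketch: behaves like a string member, rejects non-strings."""

    def __init__(self, name):
        self.name = name
        self.storage = "__" + name  # e.g. "__string", as described above

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class rather than an instance
        return getattr(obj, self.storage)

    def __set__(self, obj, value):
        if not isinstance(value, str):
            raise TypeError("%s must be a string, got %s"
                            % (self.name, type(value).__name__))
        setattr(obj, self.storage, value)


class A(object):
    string = StringStructure("string")


a = A()
a.string = "hello"      # passes the check, stored on a as "__string"
print(a.string)
try:
    a.string = 42       # caught immediately, not when sending
except TypeError as exc:
    print("rejected:", exc)
```

Because the value lives on the instance and not on the descriptor, two instances of A keep separate strings even though they share one StringStructure.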

Group and List

The above method works great for integers, times, and strings. This is because they map directly onto an immutable Python type. Group, however, acts more like an object, having mutable members, and List acts like a list of groups.

Suppose, for instance, that an object A has a group g with a string s. If g simply returned a regular object holding its members, then:

A.g.s = "string"

would modify s in the returned object, but this would not be reflected in A. Thus these mutable types need to return a more sophisticated object. We call this a Proxy.

Groups and their Proxies

A GroupProxy is created with awareness of what object it is a member of and what GroupStructure it proxies. It overrides the __getattr__ and __setattr__ methods. These methods function like __get__ and __set__, but they belong to the parent object rather than the accessed child. The GroupProxy overrides them to find the corresponding Structures within its Group.

Because it is relatively easy with Groups for Structures at different levels to share a name, we use a more complex naming scheme. In the example before A.g.s stores its values in A.__g_s rather than A.__s. This, combined with each Group ensuring that no two members share a name, means any Group is guaranteed to have unique qualifiers for every child.
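A stripped-down sketch of the proxy idea follows. The constructor signatures are assumptions and the per-member type checking is left out; the point is only that A.g.s reads and writes A.__g_s on the owning object.

```python
class GroupProxy(object):
    """Forwards attribute access to qualified slots on the owning object."""

    def __init__(self, owner, group_name, members):
        # Use object.__setattr__ so our own __setattr__ is not triggered.
        object.__setattr__(self, "_owner", owner)
        object.__setattr__(self, "_group_name", group_name)
        object.__setattr__(self, "_members", members)

    def _slot(self, name):
        if name not in self._members:
            raise AttributeError(name)
        return "__%s_%s" % (self._group_name, name)

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for group members.
        return getattr(self._owner, self._slot(name))

    def __setattr__(self, name, value):
        setattr(self._owner, self._slot(name), value)


class GroupStructure(object):
    """Descriptor that hands out a fresh proxy on every read."""

    def __init__(self, name, members):
        self.name = name
        self.members = tuple(members)

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return GroupProxy(obj, self.name, self.members)


class A(object):
    g = GroupStructure("g", ["s"])


a = A()
a.g.s = "string"              # lands on a itself, not on a throwaway object
print(a.g.s)
print(getattr(a, "__g_s"))
```

Each access to a.g builds a new GroupProxy, which is harmless because the proxy holds no data of its own.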

The way Group achieves this is that whenever it receives a name, for every child it takes its own name, appends an underscore, then adds the child's original name (stored as id in all Structures), and sets that as the child's new name. Notice that if any children are Groups, this will propagate all the way down. This is important since the children will exist before the parent, as they are passed to the parent's constructor. It's also important to use the child's original name rather than its current one. Otherwise g.g.s would start with the name "s", have "g_" prepended to become "g_s", and then have "g_g_" prepended to become "g_g_g_s".
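The renaming rule can be sketched like this. The class layout and the set_name method are assumptions for illustration; only the id attribute and the "own name, underscore, child's id" rule come from the description above.

```python
class Structure(object):
    def __init__(self, id):
        self.id = id      # original name, never changes
        self.name = id    # qualified name, rewritten by parent Groups

    def set_name(self, name):
        self.name = name


class Group(Structure):
    def __init__(self, id, children):
        Structure.__init__(self, id)
        self.children = list(children)
        ids = [child.id for child in self.children]
        if len(ids) != len(set(ids)):
            raise ValueError("duplicate child names in group %r" % id)
        self.set_name(id)

    def set_name(self, name):
        self.name = name
        for child in self.children:
            # Always build from child.id; using child.name here would
            # stack prefixes and produce "g_g_g_s" for g.g.s.
            child.set_name(name + "_" + child.id)


s = Structure("s")
inner = Group("g", [s])       # s.name is now "g_s"
outer = Group("g", [inner])   # renames inner, which renames s again
print(outer.name, inner.name, s.name)
```

Here the children are built first and renamed each time a new parent adopts them, which is why the propagation has to be repeatable.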

Lists and their Proxies

Naming threatened to get even trickier with lists. Suppose we have a list, A.l, where every element of the list is a string. Using Group's method, every string in the list would want to store its special data somewhere like A.__l_s, which would mean they would all end up with the same value. Rather than try to work the index into the name as well (and force some name propagation nightmares), everything is stored under A.__l. A.__l is a list of objects which have only one member: a GroupStructure named "group". The GroupStructure will contain whatever is stored in the list. Everything in Group i is stored in A.__l[i] and does not have to worry about stepping on anyone's toes.

ListProxies try to act as much like Python lists as possible. Rather than overriding __getattr__ and __setattr__ they override __getitem__ and __setitem__, which are the equivalent hooks for indexing items in a list. Thus, when ListProxy.__getitem__(i) is called, it returns A.__l[i].group. This means it returns a GroupProxy, and can then access the members of the list following the rules set in GroupProxy.
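A compact sketch of the list storage layout follows. The Element class, the constructor arguments, and the keyword-based append are assumptions made to keep the example short, and __setitem__ is omitted. What it does follow from the text is that the data lives in A.__l and that indexing returns the element's group proxy.

```python
class GroupProxy(object):
    """Reads and writes "__group_<name>" slots on the element it wraps."""

    def __init__(self, owner, members):
        object.__setattr__(self, "_owner", owner)
        object.__setattr__(self, "_members", members)

    def _slot(self, name):
        if name not in self._members:
            raise AttributeError(name)
        return "__group_" + name

    def __getattr__(self, name):
        return getattr(self._owner, self._slot(name))

    def __setattr__(self, name, value):
        setattr(self._owner, self._slot(name), value)


class Element(object):
    """One list entry; its only public member is the group."""

    def __init__(self, members):
        self._members = members

    @property
    def group(self):
        return GroupProxy(self, self._members)


class ListProxy(object):
    def __init__(self, owner, list_name, members):
        self._members = members
        slot = "__" + list_name          # e.g. "__l"
        if not hasattr(owner, slot):
            setattr(owner, slot, [])
        self._items = getattr(owner, slot)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, i):
        return self._items[i].group      # a GroupProxy for element i

    def append(self, **values):
        element = Element(self._members)
        self._items.append(element)
        for name, value in values.items():
            setattr(element.group, name, value)


class A(object):
    pass


a = A()
l = ListProxy(a, "l", ("s",))
l.append(s="first")
l.append(s="second")
l[0].s = "changed"                       # only element 0 is affected
print(l[0].s, l[1].s)
```

Since every element owns its own storage object, the index never has to appear in any attribute name.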

Shortcomings

This method does its job well enough, essentially pushing strict typing on a loosely typed language without muddling the syntax, but it's not perfect. The biggest fault I can find with it right now is that underscores in names can cause havoc. A structure named "getattr__" will overwrite its parent object's __getattr__ method and cause an exception to be thrown any time a member of the parent is read. Likewise, clever usage of underscores could cause an object to share a qualifier with the child of a group at its level. The answer for this within my project is that Thousand Parsec's protocol isn't trying to destroy its applications, so this shouldn't come up. This is not a good answer, however. More practical solutions could mean restricting underscores from structure names, or finding somewhere else to store the data. For now I'll leave that for future generations to ponder.
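To make the collisions concrete, here is a tiny sketch of how the storage names come out. The storage_name helper is hypothetical; it just applies the "__" prefix and underscore joining described earlier.

```python
def storage_name(path):
    """Join a chain of structure names into the attribute used for storage."""
    return "__" + "_".join(path)


nested = storage_name(["g", "s"])        # the string s inside group g
top_level = storage_name(["g_s"])        # a top-level structure named "g_s"
print(nested == top_level)               # both want the same attribute

print(storage_name(["getattr__"]))       # lands on a special method name
```

Rejecting names that contain underscores at construction time would rule out both cases, at the cost of restricting what the protocol definition may call its fields.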

Plans for June 16th

Now that the foundational modules are well tested, I'll be digging into the meatier parts of the project. For the next week I'll be working on the parser module, which reads protocol.xml and generates protocol information; essentially generating the protocol API at run time.

Update for June 9th

Work on the structures package has been mostly completed now.  The unit tests all check out and cover most of the code (about 70-80% by my estimates). I will be expanding that coverage as the project continues.

Tuesday, June 2, 2009

Plans for June 9th

Now that the xstruct module is "done" (as much so as can be said of any code 2 weeks into a project), following Mithro's suggestion I'll be working on the structures module. This is something of a higher-level wrapper for xstruct. It should also be the level at which the data transmitted with the protocol will be defined.

I expect this will be similar work to what I did for xstruct. Most of the functionality is already there, so I'll be focusing on rigorous unit testing and fixing any broken or missing functionality I find while testing.

Update for June 2nd

This week I worked on the xstruct module. I wrote unit tests for pack and unpack that:

  • Ensured all types in the documentation could be packed and unpacked to get the original value.
  • Ensured pack generated the expected string from a tuple, and unpack generated the expected tuple from a string.
  • Ensured an exception was thrown for type mismatches.
  • Ensured an exception was thrown when an integer did not fit within its type.

In doing this, I corrected bugs that prevented the 'c', 'f', and 'd' (single ASCII character, float, and double) types from being used. I also added bounds checking for all the integer types, as previously the module only checked that unsigned types did not have negative values.
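The four kinds of checks listed above look roughly like this. Note that this sketch uses Python's standard struct module as a stand-in, since xstruct's own API and format codes are not shown here; the helper names roundtrip and packs are mine.

```python
import struct


def roundtrip(fmt, *values):
    """Pack values and unpack them again; should give back the originals."""
    return struct.unpack(fmt, struct.pack(fmt, *values))


def packs(fmt, value):
    """Return True if value can be packed with fmt, False if it is rejected."""
    try:
        struct.pack(fmt, value)
        return True
    except struct.error:
        return False


# Round trip: packing then unpacking returns the original value.
print(roundtrip("!b", -128), roundtrip("!c", b"x"), roundtrip("!f", 1.5))

# Known encoding: a big-endian unsigned short.
print(struct.pack("!H", 258) == b"\x01\x02")

# Type mismatch and out-of-range integers are rejected.
print(packs("!I", "text"), packs("!B", 256), packs("!B", -1), packs("!b", 128))
```

The values at the edges of each integer range (such as -128 and 255 for one byte) are the ones worth testing explicitly, since those are where a missing bounds check shows up.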