Sunday, October 05, 2014

Recreational programming: zipper augmentations

I rarely program professionally, so when I write software, it is in the spare moments I have between home and and work (theses days providing consultancy services). That really means on the bus and local tram and subway network here in Zurich.

I do not program for the fun of making my executables run; I mean, for the fun of having the program really do something useful. I program for the fun of the exploration! What I enjoy is understanding how to use new concepts. Usually these are in specific research domains,  which unfortunately for this blog, are still to remain unpublished, but the last two weeks have had me travel into more classical mathematical and software territories, which allows me to talk about them here.

The way research goes, is that you need take leaps of faith. There are lots of territories to explore, you have these gut feelings that tell you to go in one direction and in other. Yet each of these explorations will take time and effort, leaving you with no other choice than to do nothing with your feelings. Yet too much unexplored feeling will make you anxious. Therefore from time to time, you need to take the risk of following up on your desires, even if you do not know exactly where they will lead you.

This was the case for myself two weeks ago, I was feeling nostalgic of KD-Trees. I could not know for sure if part of me wanted to revisit them because my first (summer) job as a developer had me adding these tree structure to an electronic CAD program, or because there was more to find in revisiting them.  The only way to find out was to put in some real work. I therefore quickly implemented a KD-Tree for rectangles (it uses a four dimensional search space), in F# under Xamarin on my macbook.

Here I make a small parenthesis:  F# on Xamarin for iOS is great. Xamarin comes out with an update every week or two, and they keep it one of the most productive environment that I know of to develop mobile apps on.

Back to the trees. KD-Trees have the fancy property of allowing you to make quick proximity search with them (e.g. to find the nearest neighbor to a given point). Implementing them functional style allowed me to understand how to generalize this notion of  tree augmentation within the concept of zippers and focuses. A very nice and powerful concept. I try to explain this hereunder.
  • Take a search tree. 
  • Augment the tree
    • Tree augmentations are secondary key like values that are added to each tree node and that only depend depend the local subtree values. 
    • For example, in KD-Trees, you can store per node a bounding box of all elements stored at that node and below it. 
  • Now create a zipper for the tree
    •   A zipper is structure that breaks recursive structures, allowing you to serialize an access to them
  • Make a focus for this tree and zipper
    •   A focus allows you to travel up and down a tree with immutable data structures. It is a list of zippers and a node of the tree. The list of zippers is used to "go back up the tree".
    • See Huet paper for a good description of the focus/cursor/location.
  • Augment the zippers to complement the augmentations of the tree.
    • For example, in the case of the KD-Tree with nodes augmented with bounding boxes, the zippers can be augmented with a "bounding hole" that is what space around the current subtree search is known empty.
    • This is what my excursions into KD-Tree taught me: you can augment the zipper to complement the tree augmentation.
  •  Now create a map function that goes over the tree with a focus. With both the tree nodes and the zipper are augmented, the map can do proximity searches for each element of the tree in a record time.
  • Finally, wrap up this mapping function in a zipper comonad because it is cool! 
    • I tried to find a good zipper comonad example off the web to give you a reference but could not find any clean and obvious code. So just search for zipper comonad. 
(A note here that there seems to be a lots of variations on terminology. Some people use the term zipper to describe the focus, while others use the terms cursor, location, or context instead of focus. I just know that the first example I read about used the terminology I used above).

Finally, some people have used multiple KD-Tree to do parallel searches, but only partial search in each tree (example). Such a parallel "sampling" approach is a way to reduce the "skewing" effect that that each individual tree hierarchy has on the tree and zipper augmentations.

No comments: