New Collection Types

One of the enhancements I have been toying with in a couple of languages is a new kind of collection. The principle behind it is that of adjectives. In most of my code I have adopted a few conventions; for example, singletons go in a "The" object which is global in scope. This collection makes it easy to talk about The.Mouse or The.Screen. It also makes it easy to avoid conflicting with third-party globals. I have also started putting my prototypes in "My" and the constructors in "A" and "An", which gives the written code a simple consistency. All of this got me thinking: can I use system-wide reflection to autopopulate these collections?
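As a rough illustration of the autopopulation idea, here is a minimal Python sketch. It is not the code I actually use: the Namespace and Thing classes and the singleton flag are names made up for this example, and Python's subclass hook stands in for real system-wide reflection.

```python
class Namespace:
    """A plain attribute bag used for The, A, and An (hypothetical helper)."""

    def __repr__(self):
        return f"Namespace({sorted(vars(self))})"


The = Namespace()  # singletons
A = Namespace()    # constructors whose names start with a consonant
An = Namespace()   # constructors whose names start with a vowel


class Thing:
    """Hypothetical base class; subclasses register themselves when defined."""

    def __init_subclass__(cls, singleton=False, **kwargs):
        super().__init_subclass__(**kwargs)
        if singleton:
            # Singletons are instantiated once and stored in The.
            setattr(The, cls.__name__, cls())
        else:
            # Pick the article from the first letter of the class name.
            article = An if cls.__name__[0].lower() in "aeiou" else A
            setattr(article, cls.__name__, cls)


class Mouse(Thing, singleton=True):
    pass


class Apple(Thing):
    pass


class Car(Thing):
    pass
```

With these definitions, The.Mouse is the single Mouse instance, An.Apple() builds an apple, and A.Car() builds a car, without any of them being registered by hand.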

Think of a collection of all the "red" things in your living system. You could talk about "a red apple", where "a" is a collection of constructors, "red" is a collection which filters the constructors for only those objects containing a "red" property, and "apple" further filters based on the class of thing. You could use the same methodology to identify "a green apple" or even "The green apple tree" (assuming there was only one such thing in your world). You could also refer to all "red cars" and reason about them if you could send them messages in parallel. F-Script has a wonderful operation which allows for a message send to each element in a collection, which I've implemented before in Smalltalk by overriding doesNotUnderstand on a base collection type to simply resend to each member of the collection. Being able to reason about your problem with categorical logic is very powerful when dealing with virtual worlds. It is also very useful when you have a large number of objects scattered over a distributed system. Map/reduce programmers grasp the basics of this, but don't typically program in languages with rich enough semantics.
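The adjective filter and the resend trick can be sketched in Python, where __getattr__ plays roughly the role of doesNotUnderstand. This is only an approximation under stated assumptions: Broadcast, where, and of are names invented for this example, the Car and Apple classes are toy stand-ins, and the sends happen one after another rather than in parallel.

```python
class Broadcast:
    """A collection that forwards unknown messages to every member."""

    def __init__(self, members):
        self._members = list(members)

    def __iter__(self):
        return iter(self._members)

    def __len__(self):
        return len(self._members)

    def where(self, adjective):
        """Keep members that carry a truthy attribute named `adjective`."""
        return Broadcast(m for m in self._members if getattr(m, adjective, False))

    def of(self, kind):
        """Keep members that are instances of the class `kind`."""
        return Broadcast(m for m in self._members if isinstance(m, kind))

    def __getattr__(self, name):
        # Called only when normal lookup fails, much like doesNotUnderstand.
        if name.startswith("_"):
            raise AttributeError(name)

        def send(*args, **kwargs):
            # Resend the message to each member, sequentially.
            return [getattr(m, name)(*args, **kwargs) for m in self._members]

        return send


class Car:
    def __init__(self, name, red=False):
        self.name = name
        self.red = red
        self.honked = False

    def honk(self):
        self.honked = True
        return f"{self.name} honks"


class Apple:
    def __init__(self, red=False):
        self.red = red


world = Broadcast([Car("coupe", red=True), Car("van"), Apple(red=True)])
red_cars = world.where("red").of(Car)  # "red cars"
replies = red_cars.honk()              # one message, sent to every red car
```

Here only the coupe is both red and a car, so it is the only object that receives honk; the van and the apple are filtered out before the send.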

My favorite idea for a new collection is "some", which takes a random but statistically significant sample of a data set on which to perform an operation. A practical application is to grab "some logs" and then count the number of failures over time. On a large server installation with thousands of nodes and billions of objects, being able to statistically query your data is often more important than getting an exact measurement. If the frequency of change is too high (or, on a large enough system, anything greater than zero), you are likely to introduce errors by looking at a snapshot of an inconsistent state. I see this sort of collection as a wonderful tool for building simple diagnostics, one that lets programmers add real-world health checks to their systems. Take disk IO:

some disk errors ifTrue: the administrator alert .

Wouldn't that be nice!
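
Something close to "some" can be approximated today with plain random sampling. The Python sketch below is only that, a sketch: the some and estimated_rate functions, the 10% default fraction, and the minimum sample of 30 are all assumptions picked for illustration, and the logs are made-up data. A real version would need to choose the sample size from the confidence it wants.

```python
import random


def some(items, fraction=0.1, minimum=30, rng=random):
    """Return a uniform random sample of `items`, drawn without replacement.

    The sample holds `fraction` of the items, but never fewer than `minimum`
    (or the whole data set, if it is smaller than that).
    """
    items = list(items)
    size = min(len(items), max(minimum, round(len(items) * fraction)))
    return rng.sample(items, size)


def estimated_rate(sample, predicate):
    """Estimate how often `predicate` holds, using only the sample."""
    if not sample:
        return 0.0
    return sum(1 for item in sample if predicate(item)) / len(sample)


# Made-up log entries: one in every twenty is marked as a failure.
logs = [{"node": n, "failed": n % 20 == 0} for n in range(10_000)]

sample = some(logs, fraction=0.05, rng=random.Random(0))
rate = estimated_rate(sample, lambda entry: entry["failed"])
alert_administrator = rate > 0.01  # hypothetical health-check threshold
```

The estimate comes from 500 entries rather than all 10,000, which is the whole point: the answer is approximate, but it is cheap and does not depend on a consistent snapshot of everything.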