Information Microstructure, Key to Simplicity

Ed Lowry
Advanced Information Microstructures
14 Old Village Road
Acton, Mass. 01720
508 263-3508
eslowry@mcimail.com

What is a reasonable structure for a data object, and why? Anyone who designs information systems without a sound answer to that question risks producing systems which are unreasonable in much the same way that square wheels are unreasonable.

Foundational knowledge in most fields of science and technology consists largely of knowledge about the fine structure of the subject matter. In information technology, the question of how the quality of information is affected by the fine structure of its representation has been neglected. One result has been pervasive excess complexity, but it can be corrected.

Designing a formal language for maximum simplicity of expression leads to the simplest kind of data object, the unlabeled directed arc or simple pointer. It can be shown that:

For sufficiently large deterministic languages of a given size, those which provide maximum simplicity of expression across any sufficiently evolving set of applications must use data objects which are unlabeled directed arcs exclusively.

Complex data objects can be decomposed to produce additional objects representing useful abstractions which contribute to overall simplicity. The conclusion applies broadly to technical descriptions prepared either for people or for machines. Arcs are theoretically optimum for large languages and rich applications, but empirically optimum almost everywhere.

There are about 25 similar cases where optimizing an engineering value leads to a simple irreducible structure rather than a tradeoff. Those include round wheels, tubular pipes, binary memory elements, and vertical pillars. Merging functions on directed arcs for many application domains into one language can eliminate restrictions on the generality of formal language semantics.
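
To make the decomposition of a complex data object into unlabeled directed arcs concrete, here is a minimal sketch in Python. It is one possible encoding chosen for illustration, not the construction used in the paper. The function names `decompose` and `lookup`, the node naming scheme, and the idea of giving each field its own "link" node are all assumptions introduced here. Each field of a labeled record becomes three plain (source, target) pairs, and the field label itself becomes an ordinary node that other arcs can refer to.

```python
from itertools import count


def decompose(record_id, record):
    """Turn a labeled record into a set of unlabeled directed arcs.

    Every arc is a bare (source, target) pair. Each field gets a
    hypothetical intermediate "link" node, so the label is no longer
    attached to an arc but is a node in its own right.
    """
    arcs = set()
    fresh = count()
    for label, value in record.items():
        link = f"{record_id}#{next(fresh)}"   # illustrative node name for one field
        arcs.add((record_id, link))           # the record owns this field
        arcs.add((("label", label), link))    # the label node marks the field
        arcs.add((link, ("value", value)))    # the field points at its value
    return arcs


def lookup(arcs, record_id, label):
    """Recover the values of a field using nothing but arc traversal."""
    owned = {t for s, t in arcs if s == record_id}
    tagged = {t for s, t in arcs if s == ("label", label)}
    links = owned & tagged                    # link nodes reached from both sides
    return [t[1] for s, t in arcs if s in links]


# Hypothetical example record.
person_arcs = decompose("p1", {"name": "Ada", "age": 36})
name_values = lookup(person_arcs, "p1", "name")
```

In this sketch the link and label nodes are the "additional objects" produced by decomposition: once the label is a node, it can itself be described, grouped, or queried with further arcs, which a label fused into a record field cannot.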

Knowledge workers succeed by working CAREFULLY with various kinds of information, and learning to do so pervades education. Careful work of almost any kind requires understanding and control of fine structures. It is proposed that the Internet, and any ongoing information system, be developed using arcs to model the whole system and the information in it, including legacy structures, in order to maximize simplicity and ease of large-scale integration.