Two items from yesterday's Freshmeat Newsletter that may interest digitizationblog readers: rtfx, a simple but functional RTF-to-XML converter, and Document Archive, a document-centered content management system with some neat features, including multiuser/group document management functions, hierarchical document classification, metadata and full-text searching, and optional integration with Mozilla/Firefox using XUL.
What sounds like a major breakthrough for the digitization of historical materials: the Center for Intelligent Information Retrieval at the University of Massachusetts Amherst has developed technology that searches handwritten documents. The technology "learns from a parallel body of transcribed scanned images," and "[o]nce the model is learned it may be used for searching scanned pages for which no transcriptions are available."