Next: B.1 Formal description of Up: Harvest User's Manual Previous: A.9 $HARVEST_HOME/lib/gatherer

B The Summary Object Interchange Format (SOIF)

 

Harvest Gatherers and Brokers communicate using an attribute-value stream protocol called the Summary Object Interchange Format (SOIF). Gatherers generate content summaries for individual objects in SOIF, and serve these summaries to Brokers that wish to collect and index them. SOIF provides a means of bracketing collections of summary objects, allowing Harvest Brokers to retrieve SOIF content summaries for many objects from a Gatherer in a single, efficient compressed stream. Harvest Brokers support querying SOIF data using structured attribute-value queries and many other types of queries, as discussed in Section 5.3.

To see an example of a SOIF summary stream, you can run the gather client program, as discussed in Section 4. When you do, you'll see output like this:

        @DELETE { }
        @REFRESH { }
        @UPDATE {
        @FILE { ftp://ecrc.de/pub/ECRC_tech_reports/reports/ECRC-93-10.ps.Z
        Time-to-Live{7}:    9676800
        Last-Modification-Time{9}:  774988159
        Refresh-Rate{7}:    2419200
        Gatherer-Name{50}:  Computer Science Technical Reports - Selected Text
        Gatherer-Host{21}:  bruno.cs.colorado.edu
        Gatherer-Version{3}:    0.3
        Type{10}:   Compressed
        Update-Time{9}: 774988159
        File-Size{6}:   164373
        MD5{32}:    43193942d4d53f5a8e4a7b4bcff7a415
        Embed<1>-Nested-Filename{13}:   ECRC-93-10.ps
        Embed<1>-Type{10}:  PostScript
        Embed<1>-File-Size{6}:  428233
        Embed<1>-MD5{32}:   84c123582c3d0754a39a78a7e2fb6d23
        Embed<1>-Keywords{105}: 
        technical report ECRC{93{10
        Polymorphic Sorts and Types for
        Concurrent Functional Programs
        Bent Thomsen
    
        }
    
        @FILE { ftp://cml.rice.edu/pub/reports/9404.ps.Z
        Time-to-Live{7}:    9676800
        Last-Modification-Time{9}:  772872313
        Refresh-Rate{7}:    2419200
        Gatherer-Name{50}:  Computer Science Technical Reports - Selected Text
        Gatherer-Host{22}:  powell.cs.colorado.edu
        Gatherer-Version{3}:    1.0
        Type{10}:   Compressed
        File-Size{6}:   240015
        Update-Time{9}: 772872313
        MD5{32}:    1712ce5a973cfbb0508b405d6fef1669
        Embed<1>-Nested-Filename{7}:    9404.ps
        Embed<1>-Type{10}:  PostScript
        Embed<1>-File-Size{6}:  488770
        Embed<1>-MD5{32}:   84b748dbdda572a1fb1d9c3f67a2dda9
        Embed<1>-Keywords{5135}:    /dsp/local/papers/spletter94/spletter94.dvi
    
        Submitted to: IEEE SP. Letters - May 1994
        NONLINEAR WA VELET PROCESSING FOR
        ENHANCEMENT OF IMAGES
        J.E. Odegard, M. Lang, H. Guo, R.A. Gopinath, C.S. Burrus
        Department of Electrical and Computer Engineering,
        Rice University, Houston, TX-77251
        CML TR94-04
        May 1994
    
        NONLINEAR WA VELET PROCESSING FOR ENHANCEMENT OF IMAGES
        J.E. Odegard, M. Lang, H. Guo, R.A. Gopinath, C.S. Burrus
        Department of Electrical and Computer Engineering,
        Rice University, Houston, TX-77251
        CML TR94-04
        May 1994
        Abstract
        In this note we apply some recent results on nonlinear wavelet analysis
        to image processing. In particular we illustrate how the (soft) 
        thresholding algorithm due to Donoho [2] can successfully be used to 
        remove speckle in SAR imagery. Furthermore, we also show that transform
        coding artifacts, such as blocking in the JPEG algorithm, can be removed
        to achieve a perceptually improved image by postprocessing the 
        decompressed image.
        EDICS: SPL 6.2
        Contact Address:
        Jan Erik Odegard
        Electrical and Computer Engineering - MS 366
        Rice University,
        Houston, TX-77251-1892
        Phone: (713) 527-8101 x3508
        FAX: (713) 524-5237
        email: odegard@rice.edu
    
        1 Introduction
        We consider the problem of noise reduction by nonlinear wavelet processing. 
        In particular we focus on two applications of the recently developed theory related 
        to wavelet (soft) thresholding [2]. The model
    
        [...rest deleted...]

The "@DELETE", "@REFRESH", and "@UPDATE" commands are part of the Broker's Collector interface (described in Section 5.9), which provides an additional command level on top of SOIF. Currently, only the @UPDATE section is implemented. Within the @UPDATE section you can see individual SOIF objects, each of which contains a type, a Uniform Resource Locator (URL) [2], and a list of byte-count delimited (field name, field value) pairs. The number in braces after each field name gives the length of the field value in bytes; for example, Type{10} introduces the 10-byte value "Compressed". Because the fields are byte-count delimited, they can contain arbitrary binary data. Note also that SOIF allows Embed fields, which correspond to layers of unnesting when summarizing objects. In the example above, each Compressed file is unnested to the PostScript file it contains, and the Embed<1> fields describe that PostScript file.
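To make the byte-count delimiting concrete, here is a minimal sketch of a reader for a single SOIF object, written in Python. It is not part of Harvest. It assumes the layout suggested by the sample stream: an "@TYPE { URL" line, then fields of the form name{size}, a colon, and exactly one tab character, followed by exactly size bytes of value, and finally a closing brace. The function name parse_soif_object and the example.org URL are made up for illustration.

```python
import re

# Assumed layout, inferred from the sample stream: "@TYPE { URL" on one line,
# then "Name{size}:" + one tab + exactly `size` bytes of value, then "}".
HEADER = re.compile(rb"@(\w+)[ \t]*\{[ \t]*(\S+)[ \t]*\r?\n")
ATTRIBUTE = re.compile(rb"([^\s{}:]+)\{(\d+)\}:\t")


def parse_soif_object(data, pos=0):
    """Parse one SOIF object from `data` (bytes), starting at offset `pos`.

    Returns (template_type, url, attributes, end_offset). Values are kept as
    raw bytes because SOIF values may hold arbitrary binary data.
    """
    header = HEADER.match(data, pos)
    if header is None:
        raise ValueError("expected '@TYPE { URL' at offset %d" % pos)
    template_type = header.group(1).decode("ascii")
    url = header.group(2).decode("ascii")
    pos = header.end()

    attributes = {}
    while True:
        # Skip the whitespace that separates one field from the next.
        while data[pos:pos + 1].isspace():
            pos += 1
        if data[pos:pos + 1] == b"}":
            return template_type, url, attributes, pos + 1
        field = ATTRIBUTE.match(data, pos)
        if field is None:
            raise ValueError("expected 'Name{size}:<tab>' at offset %d" % pos)
        name = field.group(1).decode("ascii")
        size = int(field.group(2))
        start = field.end()
        value = data[start:start + size]
        if len(value) != size:
            raise ValueError("value of %r is truncated" % name)
        # The value is taken by length, never by scanning for a terminator,
        # so it may itself contain newlines, braces, or NUL bytes.
        # A repeated field name overwrites the earlier value in this sketch.
        attributes[name] = value
        pos = start + size


# Hypothetical two-field object; "line one\nline two" is 17 bytes long.
stream = (
    b"@FILE { ftp://example.org/report.ps\n"
    b"Type{10}:\tPostScript\n"
    b"Keywords{17}:\tline one\nline two\n"
    b"}\n"
)
kind, url, fields, end = parse_soif_object(stream)
```

Because the function returns the offset just past the closing brace, a caller can invoke it repeatedly to walk through the objects in an @UPDATE section.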

     

SOIF is based on a combination of the Internet Anonymous FTP Archives (IAFA) IETF Working Group templates [11] and BibTeX [17]. Unlike IAFA templates, SOIF templates support streams of objects, and attribute values with arbitrary content (spanning multiple lines and containing non-ASCII characters).
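The following Python sketch shows why multi-line and non-ASCII values pose no problem for a writer: the size in braces is computed from the encoded bytes of the value, not from its character or line count. This is an illustration only, not Harvest code. The function name format_soif_object is hypothetical, UTF-8 is an assumed encoding (the text above says only that values may contain non-ASCII characters), and the colon-plus-tab separator is assumed from the sample stream.

```python
def format_soif_object(template_type, url, attributes):
    """Serialize one SOIF object to bytes.

    `attributes` maps field names to values; str values are encoded as
    UTF-8 here, which is an assumption of this sketch.
    """
    parts = [("@%s { %s\n" % (template_type, url)).encode("ascii")]
    for name, value in attributes.items():
        if isinstance(value, str):
            value = value.encode("utf-8")
        # The count is the number of bytes in the value, so embedded
        # newlines and multi-byte characters need no escaping.
        parts.append(name.encode("ascii") + b"{%d}:\t" % len(value) + value + b"\n")
    parts.append(b"}\n")
    return b"".join(parts)


# Hypothetical object with a two-line, non-ASCII value.
summary = format_soif_object(
    "FILE",
    "ftp://example.org/report.ps",
    {"Type": "PostScript", "Title": "Café\nreport"},
)
```

In this example the Title value has 11 characters but occupies 12 bytes once encoded, so the field is written as Title{12}.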

In time we will make available a specification for SOIF (and all of Harvest) that defines a set of mandatory and recommended attributes for Harvest system components. For example, the attributes for a Broker describe the server's administrator, location, software version, and the type of objects it contains.

 







Darren Hardy
Mon Apr 3 15:22:37 MDT 1995