Introduction


Overview

At the National Center for Supercomputing Applications, diverse
groups are working on supercomputing projects of many kinds.
The Hierarchical Data Format (HDF) was designed to make data
easy to share among different people, different projects, and
different types of computers, and to make files self-describing. An
extensible header, along with carefully crafted internal layers,
provides a system that can grow along with the software that we
develop. This chapter provides a brief overview of HDF
capabilities and design.


What Is HDF?

HDF is a growth-oriented approach to file format design. Rather
than try to address all of the short-term issues in a fixed format, or
attempt to solve all of the hard problems in an all-purpose format,
HDF defines a framework for continued future growth. New
calling interfaces can be defined that are compatible with old ones.
Files can remain upwardly compatible for years to come without
giving up added power in the future. HDF also makes it easy for
the user to include annotations, titles, and specific descriptions of
the data in the file, so that files can be archived with
human-readable information about their origins.

HDF uses the concept of a tagged, or object-oriented, file
organization. The idea is to store both a known format description
and the data in the same file. HDF tags describe the format of the
data because each tag is assigned a specific meaning: one tag is
assigned to "File Identifier," another is assigned to "Raster
Image," and so on (see Figure 1). A program that has been written
to understand a certain list of tag types can scan the file for those
tag types and process the data. The program can also ignore any
data that is beyond its scope.
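
The scan-and-skip behavior described above can be sketched in a few
lines of Python. This is a minimal illustration, not the HDF file
layout: the record layout (a 16-bit tag followed by a 32-bit length),
the tag numbers, and the function names are all assumptions made up
for the example.

```python
import struct

# Hypothetical tag numbers, for illustration only; they are not taken
# from the HDF specification.
TAG_FILE_ID = 1
TAG_RASTER_IMAGE = 2
TAG_FUTURE = 99  # stands in for a tag type this reader does not know

# Toy record layout (an assumption): 16-bit tag, 32-bit data length,
# both big-endian, followed by the data bytes.
RECORD_HEADER = struct.Struct(">HI")


def write_records(records):
    """Pack (tag, data) pairs into one byte string using the toy layout."""
    out = bytearray()
    for tag, data in records:
        out += RECORD_HEADER.pack(tag, len(data))
        out += data
    return bytes(out)


def scan(blob, known_tags):
    """Yield (tag, data) for tags in known_tags and skip everything else."""
    offset = 0
    while offset < len(blob):
        tag, length = RECORD_HEADER.unpack_from(blob, offset)
        offset += RECORD_HEADER.size
        if tag in known_tags:
            yield tag, blob[offset:offset + length]
        # The stored length lets the reader step over data it cannot interpret.
        offset += length


blob = write_records([
    (TAG_FILE_ID, b"sample file"),
    (TAG_FUTURE, b"\x00\x01\x02"),
    (TAG_RASTER_IMAGE, bytes([0, 64, 128, 255])),
])
found = list(scan(blob, {TAG_FILE_ID, TAG_RASTER_IMAGE}))
```

Because every record carries its own tag and length, the reader never
needs to understand the unknown record in order to reach the raster
image stored after it.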


Figure 1. Raster Image Sets in an HDF File

                                                 
HDF files never need to become out of date. For example, suppose a
site falls far behind in the HDF standard, so its users can only
work with the portions of the specification that are three years old.
Users at this site might want to import files from NCSA. Even with
the more advanced data files, they can list the types of data in the
file. All of the older tag types that they understand are still
accessible, even though they are mixed in with new kinds of
data. In addition, if the more advanced site uses the text annotation
facilities of HDF effectively, the files will arrive with complete
human-readable descriptions of how to decipher the new tag types.
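
As a rough sketch of this scenario, the following Python fragment
shows how an older reader might list the contents of a newer file.
The tag numbers, the tag names table, and the annotation text are
hypothetical, and the records are held in memory rather than read
from a real HDF file.

```python
from collections import Counter

# Hypothetical tag numbers and names; assume the older site knows only these.
OLD_SITE_TAGS = {1: "File Identifier", 2: "Raster Image", 3: "Text Annotation"}


def inventory(records, known=OLD_SITE_TAGS):
    """Count records per tag type, naming the types this reader knows."""
    counts = Counter()
    for tag, _data in records:
        counts[known.get(tag, f"unrecognized tag {tag}")] += 1
    return dict(counts)


def annotations(records, annotation_tag=3):
    """Return the human-readable annotation text stored in the file."""
    return [data.decode("ascii") for tag, data in records if tag == annotation_tag]


# A made-up file from a more advanced site: tag 57 is newer than this reader.
records = [
    (1, b"imported file"),
    (2, b"\x00\x10\x20"),
    (57, b"\xde\xad"),
    (3, b"Tag 57 holds data in a newer layout; see the sender's notes."),
    (2, b"\x30\x40\x50"),
]
listing = inventory(records)
notes = annotations(records)
```

The older reader can still count and retrieve both raster images, it
reports the newer record without failing, and the annotation text
travels with the file to explain what the unrecognized tag contains.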

To present a convenient user interface, made up of something other
than a list of tag types with their associated data requirements,
HDF supports multiple calling interfaces. The low-level calling
interface for manipulating tags and raw data is designed to be
used by systems programmers who are providing the higher-level
interfaces for applications like raster image storage or scientific
data archiving.
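
The layering described above might look something like the following
Python sketch. It is only an illustration of the idea: the class
TagStore, the functions put_image and get_image, the tag numbers, and
the use of a "ref" key to tell objects apart are all assumptions, not
the HDF calling interfaces.

```python
import struct

# Hypothetical tag numbers, for illustration only.
TAG_IMAGE_DIMENSIONS = 10
TAG_RASTER_IMAGE = 11


class TagStore:
    """Low-level interface: raw bytes stored under (tag, ref) keys."""

    def __init__(self):
        self._objects = {}

    def put(self, tag, ref, data):
        self._objects[(tag, ref)] = bytes(data)

    def get(self, tag, ref):
        return self._objects[(tag, ref)]


def put_image(store, ref, width, height, pixels):
    """Higher-level interface: store one 8-bit raster image."""
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match dimensions")
    store.put(TAG_IMAGE_DIMENSIONS, ref, struct.pack(">II", width, height))
    store.put(TAG_RASTER_IMAGE, ref, pixels)


def get_image(store, ref):
    """Higher-level interface: return (width, height, pixels)."""
    width, height = struct.unpack(">II", store.get(TAG_IMAGE_DIMENSIONS, ref))
    return width, height, store.get(TAG_RASTER_IMAGE, ref)


store = TagStore()
put_image(store, 1, 2, 2, bytes([0, 85, 170, 255]))
image = get_image(store, 1)
```

An application programmer calls only put_image and get_image and never
sees a tag; the systems programmer who wrote those two functions is the
one who works with tags and raw data.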

An important issue in data file design is that of machine
independence or transportability. The HDF design is not itself
machine-independent, but it defines the data completely. HDF requires
you to fully specify all number types used, so conversion programs can
identify what number formats are being used and do the
conversions when needed.
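
To illustrate why a full number type description is enough for
conversion, here is a small Python sketch. The descriptor used here
(kind, size in bytes, byte order) is an assumption chosen for the
example and is not the HDF number type encoding; the conversion shown
handles byte order only.

```python
import struct

# Hypothetical descriptor fields: (kind, size in bytes) plus a byte order.
FORMAT_CODES = {("int", 2): "h", ("int", 4): "i", ("float", 4): "f", ("float", 8): "d"}
ORDER_CODES = {"big": ">", "little": "<"}


def decode(raw, kind, size, order):
    """Interpret raw bytes as a list of numbers of the described type."""
    count = len(raw) // size
    fmt = ORDER_CODES[order] + str(count) + FORMAT_CODES[(kind, size)]
    return list(struct.unpack(fmt, raw))


def convert(raw, kind, size, src_order, dst_order):
    """Re-encode the same values in another byte order."""
    values = decode(raw, kind, size, src_order)
    fmt = ORDER_CODES[dst_order] + str(len(values)) + FORMAT_CODES[(kind, size)]
    return struct.pack(fmt, *values)


# Data written on a big-endian machine, described as 4-byte integers.
big = struct.pack(">3i", 1, -2, 300)
little = convert(big, "int", 4, "big", "little")
values = decode(little, "int", 4, "little")
```

The data bytes differ between the two machines, but because the
description states exactly what they mean, a conversion program can
recover the same values on either one.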


NCSA HDF Specifications


National Center for Supercomputing Applications

March 1989