About tab-delimited output
The principal machine-readable data format supported by the USGS Water Data for the Nation site is a variant of a tab-delimited ASCII file structure called "rdb". An rdb file begins with a header section containing zero or more comment lines. The header carries important information such as disclaimers, sites, and parameter and location names. The header is followed by exactly one tab-delimited column-name row, then exactly one column-definition row, and then a data section consisting of any number of rows of tab-delimited data fields.
Each header comment line starts with a sharp sign (#) followed by a space character and then any text desired. The fields in the column-name row contain the name of each column. The fields in the column-definition row contain the data definition and optional column documentation for each column. Every data row must have exactly the same number of tab-delimited columns as the column-name and column-definition rows. Null data values are allowed and appear as empty fields. Example rdb file:
# -------------------------------------------
# Documentation lines. These describe and
# identify the rdb file contents.
# -------------------------------------------
NAME    COUNT   TYP     AMT     OTHER   RIGHT
6s      5n      3s      5n      8s      8s
Bill    44      A       133     Another This
John    44              23      One     Is
Gary    77              77      Here    On
Mar     77      B       244     And     The
Greg    77      D       1111    So      Right
As a general rule, we discourage users from relying on data being in certain positions. In our Automated Retrieval FAQ, we suggest that when reading (parsing) NWISWeb rdb files, you first parse the column-name row (the first non-comment row in the file) to determine the column position of each data value, because different sites may return data columns in a different order. Information detailing the column-name syntax of NWISWeb data files is contained in the header comments of the file for each data type.
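The parsing advice above can be sketched in Python as follows. This is a minimal illustration, not an official USGS parser: the function name read_rdb is hypothetical, and the sketch assumes the text holds a single header block followed by one table.

```python
def read_rdb(text):
    """Parse one rdb block into (comments, names, definitions, rows).

    Assumes a single header block; rows are returned as dicts keyed by
    column name so callers never depend on column position.
    """
    comments = []
    body = []
    for line in text.splitlines():
        if line.startswith("#"):
            comments.append(line[1:].strip())
        elif line:  # keep rows made only of tabs (all-null rows)
            body.append(line)
    if len(body) < 2:
        raise ValueError("expected a column-name row and a column-definition row")

    names = body[0].split("\t")
    definitions = body[1].split("\t")
    rows = []
    for line in body[2:]:
        fields = line.split("\t")
        if len(fields) != len(names):
            raise ValueError(f"expected {len(names)} fields, got {len(fields)}")
        rows.append(dict(zip(names, fields)))
    return comments, names, definitions, rows


# Two data rows taken from the example file above; John's TYP is null.
sample = (
    "# Documentation lines.\n"
    "NAME\tCOUNT\tTYP\tAMT\tOTHER\tRIGHT\n"
    "6s\t5n\t3s\t5n\t8s\t8s\n"
    "Bill\t44\tA\t133\tAnother\tThis\n"
    "John\t44\t\t23\tOne\tIs\n"
)
comments, names, definitions, rows = read_rdb(sample)
amount_for_john = rows[1]["AMT"]  # looked up by name, not by position
```

Keying each row by column name means the lookup keeps working if a site returns its columns in a different order.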
Water Data for the Nation output-file format
Tab-delimited data files output by the Water Data for the Nation site consist of tab-separated columns of data for one or more sites. Each site is separated by a header section of comments and new column definitions. The following 3 column definitions are always included for each site:
Column      Definition
----------  -----------------------------------------
agency_cd   Agency collecting data or maintaining the site
site_no     USGS site-identification number
datetime    Date (and time for real-time data) in ISO format
The remaining pairs of columns vary for each site depending on whether real-time or daily data are being output and on which data parameters were selected.
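Because each site brings its own header comments and column definitions, a reader may want to split a multi-site file into per-site blocks before parsing the tables. The Python sketch below is one way to do that, assuming every site block starts with at least one comment line. The function name split_rdb_blocks, the column widths, and the second site number (00000000) are made up for illustration.

```python
def split_rdb_blocks(text):
    """Split rdb text into a list of (comment_lines, table_lines) per block.

    Assumption: a comment line that follows table rows marks the start
    of the next site's header.
    """
    blocks = []
    comments, table = [], []
    for line in text.splitlines():
        if line.startswith("#"):
            if table:  # header of the next site begins here
                blocks.append((comments, table))
                comments, table = [], []
            comments.append(line)
        elif line:
            table.append(line)
    if comments or table:
        blocks.append((comments, table))
    return blocks


# Illustrative two-site file; widths and the second site number are invented.
two_sites = (
    "# site one\n"
    "agency_cd\tsite_no\tdatetime\n"
    "5s\t15s\t16d\n"
    "USGS\t06041000\t2020-01-01\n"
    "# site two\n"
    "agency_cd\tsite_no\tdatetime\n"
    "5s\t15s\t16d\n"
    "USGS\t00000000\t2020-01-01\n"
)
blocks = split_rdb_blocks(two_sites)
```

Each block's table lines start with that site's own column-name row, so the column order can be re-read per site.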
The alphanumeric groupings in the column-definition row define the width of each column and the type of data that it contains. For example, 5s indicates that the column is 5 characters wide and contains data of the type "string". Letter designations are as follows:
d = date
m = month
n = numeric
s = string
Further information regarding the USGS use of the rdb format is available in the USGS documentation of the rdb format.
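The width-and-type groupings described above can be decoded with a few lines of Python. This is a sketch under stated assumptions: the helper name parse_column_definition is hypothetical, only the four lowercase letters listed above are recognized, and any optional column documentation after the width and letter is ignored.

```python
import re

# Letter designations as listed in this document.
TYPE_NAMES = {"d": "date", "m": "month", "n": "numeric", "s": "string"}

# Width digits followed by one type letter; trailing documentation is ignored.
_DEFINITION_RE = re.compile(r"^(\d+)([dmns])")


def parse_column_definition(field):
    """Return (width, type_name) for a definition such as '5s'."""
    match = _DEFINITION_RE.match(field.strip())
    if match is None:
        raise ValueError(f"unrecognized column definition: {field!r}")
    width, letter = match.groups()
    return int(width), TYPE_NAMES[letter]


parsed = [parse_column_definition(f) for f in "6s\t5n\t3s\t5n\t8s\t8s".split("\t")]
```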
For current-condition data, the data-column pairs use the format 'nn_nnnnn' 'nn_nnnnn_cd', where the first two-digit sequence in each column name uniquely defines the sensor (the 'data descriptor') used to collect the data and the following five-digit sequence defines the 'parameter_cd', which describes the type of data shown in the column. The second column in the pair, 'nn_nnnnn_cd', contains data-value qualification codes pertaining specifically to the preceding column.
For daily data, the data-column pairs use a similar format of 'nn_nnnnn_nnnnn' 'nn_nnnnn_nnnnn_cd', where the first two-digit sequence in each column name uniquely defines the sensor (the 'data descriptor') used to collect the data, the next five-digit sequence defines the 'parameter_cd', which describes the type of data shown in the column, and the final five-digit sequence describes the type of daily statistic used to calculate the daily data value. The second column of the pair, 'nn_nnnnn_nnnnn_cd', contains data-value qualification codes pertaining specifically to the preceding column.
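The two naming patterns above can be recognized with one regular expression. The Python sketch below follows the patterns exactly as described here (two digits, five digits, an optional five-digit statistic code, and an optional '_cd' suffix). The names DataColumn and parse_data_column are hypothetical, and files whose column names depart from this description would need a different pattern.

```python
import re
from typing import NamedTuple, Optional


class DataColumn(NamedTuple):
    dd: str                      # data descriptor (sensor)
    parameter_cd: str            # type of data in the column
    statistic_cd: Optional[str]  # daily statistic; None for current-condition data
    is_qualifier: bool           # True for the '_cd' column of a pair


_COLUMN_RE = re.compile(r"^(\d{2})_(\d{5})(?:_(\d{5}))?(_cd)?$")


def parse_data_column(name):
    """Return a DataColumn, or None for fixed columns such as 'agency_cd'."""
    match = _COLUMN_RE.match(name)
    if match is None:
        return None
    dd, parameter_cd, statistic_cd, suffix = match.groups()
    return DataColumn(dd, parameter_cd, statistic_cd, suffix is not None)


daily = parse_data_column("05_00010_00001")
daily_qualifier = parse_data_column("05_00010_00001_cd")
current = parse_data_column("02_00065")
```

Returning None for names that do not match lets the same loop pass over agency_cd, site_no, and datetime without special cases.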
Lists of the specific parameter codes, statistic codes (for daily data), and data-value qualification codes (if present) are included in the header for each site. For example:
# Data for the following station(s) are contained in this file
# -------------------------------------------------------------
# USGS 06041000 Madison River bl Ennis Lake nr McAllister MT
#
# Available data at this site--lines with asterisk '*' are included in this output.
#  DD parameter statistic - Description
#  -- ----- ----- ------------------------------------
# *02 00065 00003 - Gage height, feet (Mean)
# *05 00010 00001 - Temperature, water, degrees Celsius (Maximum)
# *05 00010 00002 - Temperature, water, degrees Celsius (Minimum)
# *05 00010 00003 - Temperature, water, degrees Celsius (Mean)
# *06 00060 00003 - Discharge, cubic feet per second (Mean)
#
# Data-value qualification codes included in this output:
# A Approved for publication -- Processing and review completed.
# P Provisional data subject to revision.
# e Value has been estimated.
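A header like the one above can be mined for the meaning of each data column. The following Python sketch is an illustration only: the function name parse_site_header is hypothetical, and the line layout it expects is inferred from this single example, so other headers may need adjustments.

```python
import re

# Matches lines such as '# *02 00065 00003 - Gage height, feet (Mean)'.
# The statistic code is optional in this sketch.
_SERIES_RE = re.compile(r"^#\s*(\*?)(\d{2})\s+(\d{5})(?:\s+(\d{5}))?\s+-\s+(.+)$")


def parse_site_header(lines):
    """Return (series, qualifiers) extracted from header comment lines."""
    series = []
    qualifiers = {}
    in_qualifiers = False
    for line in lines:
        match = _SERIES_RE.match(line)
        if match:
            star, dd, parameter_cd, statistic_cd, description = match.groups()
            series.append({
                "included": star == "*",
                "dd": dd,
                "parameter_cd": parameter_cd,
                "statistic_cd": statistic_cd,
                "description": description.strip(),
            })
            continue
        text = line.lstrip("#").strip()
        if text.startswith("Data-value qualification codes"):
            in_qualifiers = True
        elif in_qualifiers:
            if not text:  # an empty comment line ends the list
                in_qualifiers = False
            else:
                code, _, description = text.partition(" ")
                qualifiers[code] = description.strip()
    return series, qualifiers


header = [
    "# Available data at this site--lines with asterisk '*' are included in this output.",
    "#  DD parameter statistic - Description",
    "#  -- ----- ----- ------------------------------------",
    "# *02 00065 00003 - Gage height, feet (Mean)",
    "# *06 00060 00003 - Discharge, cubic feet per second (Mean)",
    "#",
    "# Data-value qualification codes included in this output:",
    "# A Approved for publication -- Processing and review completed.",
    "# P Provisional data subject to revision.",
    "# e Value has been estimated.",
]
series, qualifiers = parse_site_header(header)
```

Joining a series entry's dd, parameter_cd, and statistic_cd with underscores gives the matching daily-data column name, for example 02_00065_00003.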