Document PSD-CODATA-0703
                         PIR Installation Document
                     For the CODATA Format Release of
 
               PIR-International Protein Sequence Database
                    Release 68.02,   May 10, 2001
                 228873 entries,   79481213 residues


              The collaborating centers of PIR-International:

                  Protein Information Resource (PIR)*
                National Biomedical Research Foundation
                       3900 Reservoir Road, NW,
                      Washington, DC  20007, USA


  Japan International Protein           Munich Information Center for
  Information Database (JIPID)             Protein Sequences (MIPS)
        Amakubo 1-16-1          GSF-Forschungszentrum f. Umwelt und Gesundheit
   Tsukuba 305-0005, Japan            am Max-Planck-Institut f. Biochemie
                                 Am Klopferspitz 18, D-82152 Martinsried, FRG


This database may be redistributed, provided that this notice be given to 
each user and that the words "Derived from" shall precede this notice if 
the database has been altered by the redistributor.

We have made every effort to ensure proper functioning of the programs 
and cannot be held responsible for the consequences to users of any 
problems encountered during their operation.


                *PIR is a registered mark of NBRF


PIR is partially supported by National Library of Medicine grant LM05798


1.0 CODATA Format
=================
This document describes the quarterly release of the PIR-International 
Protein Sequence Database in CODATA format, formerly distributed on magnetic
media for non-VAX/VMS systems in fixed-length 80-byte records.
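
As an illustration of what fixed-length 80-byte records imply for a reader
program, the following is a minimal Python sketch. It assumes each record is
exactly 80 bytes, padded with trailing spaces, with no line terminators
between records; that layout and the helper name iter_records are assumptions
made for this example, not part of the distribution. The demonstration data
is made up.

```python
import io

RECORD_LEN = 80  # record length stated in this document


def iter_records(stream, record_len=RECORD_LEN):
    """Yield each fixed-length record as text, trailing padding removed.

    Assumes records are stored back to back with no line terminators.
    """
    while True:
        chunk = stream.read(record_len)
        if not chunk:
            break
        yield chunk.decode("ascii", errors="replace").rstrip()


# Made-up in-memory data standing in for a release file.
demo = io.BytesIO(b"first record".ljust(80) + b"second record".ljust(80))
records = list(iter_records(demo))
print(records)
```

If a given copy of the files does carry line terminators, ordinary
line-by-line reading would be the simpler choice.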

2.0 In this Release
===================
Release 68.02 of the Protein Sequence Database contains 228,873 entries
and 79,481,213 residues. The Release is divided into four sections.
Section 1, Fully Classified Entries, contains 20,498 entries and
8,042,606 residues. Section 2, Verified and Classified Entries, contains
207,908 entries and 71,344,971 residues. Section 3, Unverified Entries,
contains 62 entries and 27,267 residues. Section 4, Unencoded or
Untranslated Entries, contains 405 entries and 66,369 residues. A total
of 32,106 superfamilies are represented in sections 1 and 2. 
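
The per-section figures quoted above can be totalled with a few lines of
Python. This is only a convenience sketch for checking the arithmetic; the
variable names are chosen for this example.

```python
# (entries, residues) for each section, as quoted in this document
sections = {
    1: (20498, 8042606),
    2: (207908, 71344971),
    3: (62, 27267),
    4: (405, 66369),
}

total_entries = sum(entries for entries, _ in sections.values())
total_residues = sum(residues for _, residues in sections.values())

print(f"{total_entries:,} entries, {total_residues:,} residues")
```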

3.0 Features in this Release
============================
Starting with Release 64.00 of the Protein Sequence Database, PIR-International
includes status information in protein titles and in function and complex
records. The status identifiers are as follows.

[validated] = in a title or function block means that one of the references
in the entry contains some experimental evidence for the protein's function.

[similarity] = in a title or function block means that the name and/or 
function has been assigned by end-to-end sequence similarity with other 
entries that have that same name or function.

[imported] = in a title means that the name was imported with the sequence from
GenBank, EMBL, DDBJ, or another source and has not been verified by PIR.

Complete coverage of the entire database will not be obtained for several
releases.  The absence of a status identifier at this time should NOT be taken
as an indication that the information in the title or function blocks is not
correct or has not been evaluated by PIR staff.
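
A program that consumes titles or function text could pick out these status
identifiers with a small pattern match. The Python sketch below assumes the
identifier appears literally, in square brackets, somewhere in the text; the
function name find_status and the sample strings are hypothetical. In keeping
with the note above, an empty result means only that no identifier is
present, not that the information is unverified.

```python
import re

# Status identifiers listed in section 3.0
STATUS_TAGS = ("validated", "similarity", "imported")
_STATUS_RE = re.compile(r"\[(%s)\]" % "|".join(STATUS_TAGS))


def find_status(text):
    """Return the status identifiers found in a title or function string.

    An empty list means no identifier is present; it says nothing about
    whether the information has been evaluated.
    """
    return _STATUS_RE.findall(text)


# Made-up sample strings, for illustration only.
print(find_status("example protein name [similarity]"))
print(find_status("example protein name"))
```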