nfrefs.dat includes the NREF entries that contain RefSeq sequences but are not in iProClass. Each entry starts with ">" and the NF ID, followed by some of the fields listed below:

Title    =Protein name
Species  =
Common   =Common name (for Species)
TaxId    =Taxonomy ID (species level)
Lineage  =Taxonomy lineage
SourceOrg=TaxId:Source Organism
Biblio   =PMIDs
GE       =GenPept-- ID^|^Title^|^Organism^|^TaxonId^|^AC#
PD       =PDB-- ID^|^Title^|^Organism^|^TaxonId^|^AC#
RE       =RefSeq-- ID^|^Title^|^Organism^|^TaxonId^|^AC#
Length   =Length of sequence
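
The layout described above could be read with a small parser along the following lines. This is only a sketch under several assumptions that the description does not confirm: that each field sits on its own line in the form "Key =value", that the text after ">" on the header line is the NF ID, that a field may repeat within an entry, and that GE, PD and RE values consist of the five "^|^"-separated columns in the order listed. The names parse_nfrefs, XREF_KEYS and XREF_COLUMNS are hypothetical and are not part of any PIR tool.

```python
from collections import defaultdict

# Cross-reference fields whose values are assumed to be "^|^"-separated.
XREF_KEYS = {"GE", "PD", "RE"}
# Column order taken from the field list; assumed to hold for every record.
XREF_COLUMNS = ("ID", "Title", "Organism", "TaxonId", "AC#")


def parse_nfrefs(lines):
    """Yield (nf_id, fields) pairs from an iterable of text lines.

    Assumption: one "Key =value" field per line, entries opened by ">".
    Every field maps to a list because a key may occur more than once.
    """
    nf_id, fields = None, None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous entry, if there was one.
            if nf_id is not None:
                yield nf_id, dict(fields)
            nf_id, fields = line[1:].strip(), defaultdict(list)
            continue
        if nf_id is None or "=" not in line:
            continue  # skip text before the first entry or without a key
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key in XREF_KEYS:
            # Pair each "^|^"-separated part with its assumed column name.
            value = dict(zip(XREF_COLUMNS, value.split("^|^")))
        fields[key].append(value)
    if nf_id is not None:
        yield nf_id, dict(fields)
```

With a file opened in text mode, for example `with open("nfrefs.dat") as handle:`, iterating over `parse_nfrefs(handle)` would give one NF ID and one dictionary of field lists per entry. Keeping every field as a list avoids losing data if an entry carries several GE, PD or RE lines, at the cost of an extra index when a field occurs once.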