Investigation is ongoing into the biochemical and functional properties of a plasma sialoglycoprotein of 120 kDa (sgp120-A) that was first identified by isolation (3-5% yield) with the second component of human complement (C2) on "C4b"-Sepharose. The capacity of sgp120 to bind to the galactosyl-specific lectin, Jacalin, was used to purify sgp120 at up to 40% yield from plasma, which contained both C4b-binding and primarily non-C4b-binding (sgp120-I) forms. Although the two forms are immunochemically indistinguishable by double diffusion analysis, they are distinct by a number of criteria; in particular, sgp120-A possesses most of the described classical complement pathway inhibitory activity. We have fully sequenced clone A, which produces a fusion protein detectable by monospecific antibody to sgp120. We are in the process of completing the sequence of the larger identified clones B (2.0 kb) and C (2.1 kb). Clone D (2.2 kb) was fully sequenced and yielded an open reading frame translation of 729 amino acids. A FASTA database search of all protein sequences with the clone D translation confirmed our earlier conclusions of the unique structure of sgp120. This latest search of 113,000 sequences did, however, find significant homology (41% identity in a 353 amino acid overlap) with human inter-alpha-trypsin inhibitor. We have essentially sequenced clones B and C; like clones A and D, these also contain the 998 bp 3'-terminal region of the sgp120 gene. Like clone B, clones C and D also contain the N-terminal 15 amino acid sequence identified in the 25 kDa peptide as well as the 16 amino acid sequence found in the 35 kDa peptide obtained by kallikrein digestion of plasma-purified sgp120. To search further for full-length clones of sgp120, we have used a probe prepared from the 5' end of clone D. This probe, in conjunction with a 1 kb probe prepared from clone A, identified two clones: 7-1 (G) and 8-1 (F).
Preliminary results obtained from sequencing isolated sgp120 cDNA from clones F and G extend the known base sequence of clone D from 2189 bp to 2270 bp (81 bp) and to 2404 bp (215 bp), respectively. Bestfit analysis of the open reading frame of clone G against the N-terminal sequence of sgp120 did not identify the N-terminal sequence. Thus, a full-length gene for sgp120 still needs to be identified to obtain the entire coding sequence, which should minimally be between 2.6 and 3.0 kb. Analysis of the internal base sequence of sgp120 cDNA from clones B to G has identified an internal splice region accounting for an insertion of up to 35 amino acids within clone D (residues #364-388), 30 amino acids within clones B and G, and no insert in clones C and F. This information suggests a rationale for the two forms of sgp120 described earlier.