We set out to determine the primary structure of the respiratory syncytial (RS) virus nucleocapsid (NC) protein in order to define the domains of this protein that are involved in RNA-protein interaction. In addition, we wished to understand the genetic organization of this human pathogen at the nucleotide level. Our interest in the NC protein stems from a desire to understand nucleocapsid assembly in RS virus and the interactions between the NC protein and the transcriptional enzymes. Knowledge of the primary structure of this major capsid protein should provide insight into the nature of these interactions, and to that end we have determined its sequence. A recombinant plasmid (pRSB11) was selected from an RS cDNA library using screening procedures described elsewhere. DNA sequencing of the RS viral insert by the Maxam-Gilbert method revealed an RS virus sequence of 1430 nucleotides, excluding the poly(A) tail. Primer extension and dideoxy sequencing showed that the insert lacks only four nucleotides corresponding to the 5' end of the mRNA. Notably, a nine-nucleotide sequence, 5'-NGGGCAAAT-3', was present at the 5' end of the mRNA strand. This sequence is conserved at the 5' end of all seven RS viral genes sequenced so far. The NC clone contains a single long open reading frame encoding 467 amino acids, corresponding to a protein of 51,540 daltons. No homology was found between the RS NC sequence and the capsid proteins of VSV, influenza virus, coronavirus, or TMV, suggesting that RS virus is evolutionarily distinct from these viruses.
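The open-reading-frame analysis described above can be illustrated with a minimal Python sketch. It is not the procedure used in this work: the function names, the toy sequence, and the rule-of-thumb average residue mass of about 110 daltons are all assumptions introduced here for illustration. The sketch scans the three forward frames of an mRNA-sense sequence for the longest ATG-to-stop reading frame, then checks that the figures reported above are mutually consistent.

```python
# Hypothetical illustration only: names and the toy sequence are invented,
# and this is not the method used to analyze the pRSB11 insert.

STOP_CODONS = {"TAA", "TAG", "TGA"}

# Assumption: rule-of-thumb average mass of one amino acid residue, in daltons.
AVERAGE_RESIDUE_MASS_DA = 110.0


def longest_orf(seq: str) -> tuple[int, int]:
    """Return (start, end) of the longest ATG-to-stop ORF in the three
    forward frames. `end` is exclusive and includes the stop codon.
    Returns (-1, -1) if no complete ORF is found."""
    seq = seq.upper().replace("U", "T")
    best = (-1, -1)
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3) - start > best[1] - best[0]:
                    best = (start, i + 3)
                start = None  # look for a further ORF downstream
    return best


def residues_in_orf(start: int, end: int) -> int:
    """Number of encoded amino acids, excluding the stop codon."""
    return (end - start) // 3 - 1


def approximate_mass_da(n_residues: int) -> float:
    """Rough protein mass estimate from the residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA


if __name__ == "__main__":
    # Invented toy sequence; it is NOT the RS virus NC sequence.
    toy = "GGGGCAAATATGGCTCTTAGCTAAGG"
    start, end = longest_orf(toy)
    print(start, end, residues_in_orf(start, end))

    # Consistency check on the reported figures.
    orf_nucleotides = (467 + 1) * 3   # 467 codons plus one stop codon
    print(orf_nucleotides)            # 1404, within the 1430-nucleotide insert
    print(round(51540 / 467, 2))      # average residue mass implied by the abstract
```

On the reported figures, a 467-residue reading frame with its stop codon spans 1404 nucleotides, which fits within the 1430-nucleotide sequence, and the implied average residue mass of roughly 110 daltons is in the range typical for proteins.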