With the increased availability of parallel computer resources amenable to large-scale scientific computing, it is essential to optimize the use of these resources. Our efforts include the development of parallel computing techniques suitable for macromolecular simulation, and the development of a parallel computer cluster and related software for high-efficiency simulations at low cost.

- The LoBoS project (Lots of Boxes on Shelves) - Massively Parallel Computing using Off-The-Shelf Personal Computers

LoBoS I has been completed using commodity PCs and provides a greater than 10-fold improvement in price/performance compared with the primary supercomputer vendor's offerings. This approach has been very successful with the 128-processor LoBoS I, and LoBoS II is now being developed with faster computers and gigabit Ethernet switches to provide rapid communication through an improved network topology and improved communication paths. Cluster management and queuing software continue to be refined and have been distributed to 5 sites.

The LoBoS project has produced the fastest computational system at the NIH. It opens up a new realm of high-performance computing that continues to drive cost down while improving reliability. The system is built from commodity-priced Intel-based PCs and has been designed to survive all "single point of failure" problems.

Development of methods and software to make productive use of general parallel machines for macromolecular simulations is an ongoing activity. The global communication approach has been successful in providing an efficient, full-featured version of CHARMM. This parallel version of CHARMM has been extended to run on almost any MIMD parallel computer platform. Our current development effort involves a scalable algorithm that promises to greatly reduce the communication cost for very large MPP machines or for large workstation clusters.
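To make the idea of a global communication step concrete, the following is a minimal sketch, not CHARMM code. It assumes a replicated-data scheme in which every worker holds all coordinates, computes the forces for its share of atom pairs, and the partial force arrays are then summed across all workers. The function names, the toy 1-D pair force, and the round-robin pair assignment are hypothetical choices made only for illustration.

```python
from itertools import combinations

def pair_force(xi, xj):
    """Toy 1-D pair force on atom i due to atom j (hypothetical, illustration only)."""
    return xj - xi

def partial_forces(coords, rank, nworkers):
    """Forces from the pairs assigned to one worker; each worker sees all coordinates."""
    forces = [0.0] * len(coords)
    for k, (i, j) in enumerate(combinations(range(len(coords)), 2)):
        if k % nworkers != rank:
            continue  # this pair belongs to another worker
        f = pair_force(coords[i], coords[j])
        forces[i] += f   # force on i
        forces[j] -= f   # equal and opposite force on j
    return forces

def global_sum(partials):
    """Stand-in for the global reduction that combines every worker's force array."""
    return [sum(column) for column in zip(*partials)]

def parallel_forces(coords, nworkers):
    """Simulate all workers in one process, then combine their partial results."""
    partials = [partial_forces(coords, r, nworkers) for r in range(nworkers)]
    return global_sum(partials)
```

In this sketch the size of the array passed to `global_sum` does not shrink as workers are added, which is one reason a global communication step can come to dominate on very large machines.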
Current projects include:
- Development of parallel QM/MM methods
- Development and support of parallel CHARMM
- Development of a 10 teraflop GRAPE computer (at RIKEN) for macromolecular simulation
- Development and evaluation of scalable parallel algorithms for molecular dynamics
- Development of an efficient communication scheme for 3-dimensional FFT calculations
- Development of latency-tolerant algorithms for parallel computing

LoBoS' off-the-shelf components include Fast Ethernet network components, which have a high latency (~100 ms). Consequently, most algorithms have a practical limit of 8 to 32 nodes before efficiency becomes unacceptably low. This work uses the Fire Brigade method, which distributes data as it becomes valid, thereby avoiding most of the latency issues.
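The effect of latency on scaling can be illustrated with a deliberately simple cost model. This is a sketch under stated assumptions, not a measurement of LoBoS and not an implementation of the Fire Brigade method. It assumes that the computation in each step divides evenly across nodes and that each step also pays for a fixed number of sequential message exchanges, each costing one network latency. The parameter names and the sample values are hypothetical.

```python
def step_time(nodes, compute_time, messages, latency):
    """Time for one step: evenly divided compute plus latency-bound messaging.

    compute_time: serial compute time per step (seconds, assumed)
    messages:     sequential message exchanges per step (assumed constant)
    latency:      cost of one message exchange (seconds, assumed)
    """
    comm = messages * latency if nodes > 1 else 0.0  # a single node sends nothing
    return compute_time / nodes + comm

def efficiency(nodes, compute_time, messages, latency):
    """Parallel efficiency: serial time divided by (nodes * parallel time)."""
    return compute_time / (nodes * step_time(nodes, compute_time, messages, latency))

if __name__ == "__main__":
    # Hypothetical values chosen only to show the trend.
    for n in (1, 8, 32, 128):
        print(n, round(efficiency(n, compute_time=1.0, messages=10, latency=0.001), 3))
```

Under this model the communication term stays constant while the compute term shrinks with the node count, so efficiency falls steadily as nodes are added. Overlapping communication with computation, which is the general aim of latency-tolerant schemes, reduces the effective size of that constant term.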