Abstract
Cryo-electron microscopy (cryo-EM) is a powerful technique that produces volumetric images of large molecules. Images produced at near-atomic (<5 Å) resolution can be used to determine the structure of those molecules. Due to experimental difficulties, only a small portion of the images reach near-atomic resolution, while the majority of available images are produced at subnanometer (5-10 Å) resolution. At subnanometer resolution, the backbone of the structure cannot be constructed directly from the images. Nevertheless, de novo modeling can be used to derive the atomic structure of the molecules. Detecting secondary structure elements (helices and sheets) in the volumetric images is crucial for de novo tools. Moreover, observing the structural connections between these elements is extremely helpful in solving the topology determination problem. The topology determination problem can be defined as finding the correspondence between the secondary structure elements found in the sequence of the protein molecule and those found in its cryo-EM volumetric image. This problem has been proven to be NP-hard.
In this project, a complete de novo system capable of efficiently deriving the structure of large molecules from authentic cryo-EM volumetric images will be developed. The proposed system is divided into a number of components, each of which addresses an important sub-problem. De novo modeling will be accomplished by three main sub-systems: (1) extracting a fine-quality skeleton of the molecule from its noisy cryo-EM image, (2) addressing the topology determination problem, and (3) building the atomic structure of the target molecule. The extracted skeleton of the protein molecule will be used in topology determination to drastically reduce the size of the search space. In addition, the skeleton will be used to construct the structure of the molecule.
The system will be evaluated using a benchmark of authentic and synthesized images. The novel algorithms that will be developed for the proposed de novo system are significant to various fields. For example, the proposed skeletonization algorithm can be used in biomedical disciplines that rely on skeletons. Moreover, the dynamic matching algorithm of this research can be generalized to similar matching problems in other fields of study. Further, the proposed system will help in understanding the fundamental functions of some protein types, such as membrane proteins and large protein molecules, that are hard to study using traditional experimental techniques.
This project will bring together the effort and skills of the collaborators, Dr. Kamal Al Nasr, Dr. Wei Chen, and Dr. Matthew Hayes from the Department of Computer Science at Tennessee State University (TSU), and the consultant, Dr. Montserrat Samso from Virginia Commonwealth University (VCU), along with undergraduate and graduate students. The principal investigators will be responsible for developing the proposed algorithms and carrying out the design and analysis part of the project. The students will be exposed to bioinformatics research and will gain hands-on experience with important bioinformatics problems. With the guidance of the principal investigators, the students will collect data, implement the proposed algorithms, and analyze the results. Further, this project will help TSU prepare a strong workforce of minority students who can compete with their peers in industry or academia in various areas of bioinformatics. The proposed activities will enhance the integration of research and education in biology and computer science, and it is expected that undergraduate students will be more motivated to pursue careers in medical fields.