Computer Science Department Events
There will be a PhD Defense on Thursday, May 3rd, at 11:00 AM by Kamal Al Nasr.
THURSDAY MAY 3rd, E&CS Building RM 2120
TIME: 11:00 AM
TITLE: de novo Protein Structure Modeling From CryoEM Data Through A Dynamic Programming Algorithm In The Secondary Structure Topology Graph
SPEAKER: Kamal Al Nasr
Abstract:
Proteins are the molecules that carry out vital cellular functions, and they make up more than half of the dry weight of every cell. In nature, a protein folds into a unique and energetically favorable 3-dimensional structure that is critical to its biological function. In contrast to other methods for protein structure determination, electron cryo-microscopy (CryoEM) is able to produce volumetric maps of proteins that are poorly soluble, large, and hard to crystallize. Furthermore, it studies proteins in their native environment. Unfortunately, current CryoEM techniques often produce volumetric maps at medium resolution (~6 to 10 Å), at which it is hard to determine the atomic structure of the protein. However, the resolution of the volume maps is improving steadily, and recent work has obtained atomic models at higher resolutions (~3 Å).
De novo protein modeling is the process of building the structure of a protein using its CryoEM volume map. The medium-resolution volumetric maps generated by CryoEM pose a new challenge. At medium resolution, the location and orientation of secondary structure elements (SSEs) can be identified visually and computationally. However, the order and direction of the SSEs detected from the CryoEM volume map (together called the protein topology) are not visible. To determine the protein structure, the topology of the SSEs has to be worked out first, and then the backbone can be built. Consequently, the topology problem has become a bottleneck for protein modeling using CryoEM. In this dissertation, we focus on quickly reducing the large topological space to a small subset of possible topologies without the use of energy evaluation. The goal is to include the true topology in this subset, so that conformations can be built for the likely topologies and evaluated further. To generate the small subset, the problem is translated into a layered graph representation. We developed a dynamic programming algorithm (TopoDP) for the new representation to overcome the problem of the large search space. Our approach shows improved accuracy, speed, and memory use when compared with existing methods. Generating such a subset with a brute-force method is infeasible. TopoDP therefore provides a fast geometrical method of ranking the topologies without constructing the protein chain, which is often time consuming.
A novel approach to deriving the structure is proposed in this thesis. The first step reduces the topological space using the geometrical features of the secondary structures through a constrained k-shortest paths method in the topology graph. The second step models the helices and loops using the trace from the density map through a forward-backward CCD.
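To make the idea of searching the topology graph more concrete, here is a minimal Python sketch of a dynamic program over (used sticks, last stick, direction) states. It is an illustration only and not the TopoDP implementation: it finds only the single best topology rather than the k best, and the scoring is a toy assumption (the mismatch between the gap separating two consecutive sticks and an expected loop span). The names `step_cost`, `best_topology`, `brute_force`, `sticks`, and `loop_lengths` are hypothetical and do not come from the dissertation.

```python
from itertools import permutations, product
from math import dist


def step_cost(sticks, loop_lengths, layer, j, d, k, e):
    """Toy score (an assumption): mismatch between the gap from the exit
    end of stick j to the entry end of stick k and the expected loop span.
    Direction 0 enters a stick at its first endpoint, direction 1 at its second."""
    gap = dist(sticks[j][1 - d], sticks[k][e])
    return abs(gap - loop_lengths[layer])


def best_topology(sticks, loop_lengths):
    """Return (cost, [(stick, direction), ...]) for the cheapest topology.
    Assumes at least one stick and len(loop_lengths) == len(sticks) - 1."""
    n = len(sticks)
    full = (1 << n) - 1
    # state = (set of used sticks as a bitmask, last stick, its direction)
    best = {(1 << j, j, d): (0.0, None) for j in range(n) for d in (0, 1)}
    # Masks in increasing order: every predecessor mask is numerically smaller.
    for mask in range(1, full):
        layer = bin(mask).count("1") - 1  # index of the last placed sequence SSE
        for j in range(n):
            for d in (0, 1):
                state = (mask, j, d)
                if state not in best:
                    continue
                cost = best[state][0]
                for k in range(n):
                    if mask & (1 << k):
                        continue  # stick k is already assigned
                    for e in (0, 1):
                        nxt = (mask | (1 << k), k, e)
                        c = cost + step_cost(sticks, loop_lengths, layer, j, d, k, e)
                        if nxt not in best or c < best[nxt][0]:
                            best[nxt] = (c, state)
    end = min((s for s in best if s[0] == full), key=lambda s: best[s][0])
    total, path, state = best[end][0], [], end
    while state is not None:  # walk predecessor links back to the first SSE
        path.append((state[1], state[2]))
        state = best[state][1]
    return total, path[::-1]


def brute_force(sticks, loop_lengths):
    """Enumerate all n! * 2^n topologies; only feasible for tiny inputs."""
    n = len(sticks)
    best = None
    for order in permutations(range(n)):
        for dirs in product((0, 1), repeat=n):
            c = sum(
                step_cost(sticks, loop_lengths, i, order[i], dirs[i],
                          order[i + 1], dirs[i + 1])
                for i in range(n - 1)
            )
            if best is None or c < best[0]:
                best = (c, list(zip(order, dirs)))
    return best


# Made-up example: three sticks, each given by its two 3D endpoints.
sticks = [((0, 0, 0), (0, 0, 10)), ((5, 0, 10), (5, 0, 0)), ((12, 0, 0), (12, 0, 10))]
loop_lengths = [5.0, 7.0]

if __name__ == "__main__":
    print(best_topology(sticks, loop_lengths))
```

The dynamic program visits on the order of 2^n * n^2 state transitions, while the brute-force enumeration grows as n! * 2^n, which illustrates why exhaustive enumeration quickly becomes impractical as the number of SSEs grows.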