Active Research Projects

Dr. Yaohang Li

Department of Computer Science

Old Dominion University

 


Machine Learning and Artificial Intelligence in Nuclear Physics

Through my collaboration with the Center for Theoretical and Computational Physics at Jefferson Lab, I became interested in developing machine learning and data analytics algorithms for nuclear physics. My research currently focuses on two projects.

1) Physical Event Generator: We are using generative adversarial networks (GANs) to build AI-based Monte Carlo event generators (MCEGs) capable of faithfully generating the final-state particle phase space. In many GAN applications, such as generating realistic, sharp-looking images, agreement between the distribution of GAN-generated samples and the true distribution is often not strictly enforced. GANs for generating physical events, by contrast, must model the distributions of event features and their correlations precisely enough to faithfully replicate the nature of particle reactions. Moreover, events generated by GANs should not violate well-known physical laws, such as baryon number and momentum conservation. We address these issues by incorporating physics into the design of the GAN architectures. The following two preprints summarize our recent work on physical event generators.

Our FAT-GAN package can be found at https://github.com/yaohangli/FAT-GAN.
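One generic way to make generated events respect a conservation law by construction is to let the generator produce the momenta of all but one final-state particle and compute the last one from the constraint. The minimal sketch below illustrates that idea only; it is not the FAT-GAN architecture. The names `close_momentum`, `raw_momenta`, and `total_momentum` are hypothetical, and the random numbers merely stand in for a generator's output.

```python
import random

def close_momentum(raw_momenta, total_momentum=(0.0, 0.0, 0.0)):
    """Append one particle whose momentum balances the event.

    raw_momenta: list of (px, py, pz) tuples for n-1 particles
                 (hypothetical stand-in for a generator's output).
    total_momentum: initial-state momentum that must be conserved (assumed known).
    """
    sums = [sum(p[i] for p in raw_momenta) for i in range(3)]
    last = tuple(total_momentum[i] - sums[i] for i in range(3))
    return list(raw_momenta) + [last]

# Stand-in for generator output: three particles with Gaussian momenta.
rng = random.Random(0)
raw = [tuple(rng.gauss(0.0, 1.0) for _ in range(3)) for _ in range(3)]

event = close_momentum(raw)
# Component-wise sum of all momenta; it matches total_momentum up to rounding.
residual = [sum(p[i] for p in event) for i in range(3)]
```

Because the last momentum is derived rather than learned, every generated event satisfies the constraint regardless of how well the network is trained; the cost is that the derived particle's distribution is only controlled indirectly.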

This work was supported by Jefferson Lab LDRD projects No. LDRD19-13 and No. LDRD20-18, and in part by U.S. Department of Energy contract DE-AC05-06OR23177, under which Jefferson Science Associates, LLC, manages and operates Jefferson Lab.

2) Nuclear Femtography: The goal of the nuclear femtography problem is to understand the nucleon's internal three-dimensional quark and gluon structure. We are currently working on a neural network architecture for solving the inverse problem.

This work is supported by the Southeastern Universities Research Association (SURA) Center for Nuclear Femtography Initiative.

Randomized Algorithms for Numerical Linear Algebra

Matrix operations are fundamental components of many data analysis and computational simulation applications. In the era of big data, many traditional numerical methods for matrix operations, designed to minimize floating-point operations, fail to scale or cannot handle the complexity that emerges with large data sets. Thanks to their attractive properties, including fast approximation, pass efficiency, flexible implementation, memory efficiency, and natural parallelism, randomized algorithms can address many of these issues.
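As a concrete illustration of fast approximation and pass efficiency, the sketch below shows a standard Gaussian-sketch low-rank approximation that reads the matrix only twice. It is a generic textbook scheme written with NumPy, not our group's specific implementation; the function name `randomized_low_rank` and the `oversample` parameter are assumptions made for this example.

```python
import numpy as np

def randomized_low_rank(A, rank, oversample=10, seed=0):
    """Return Q, B such that A is approximately Q @ B, using a Gaussian sketch."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    k = min(rank + oversample, min(A.shape))
    omega = rng.standard_normal((n, k))  # random test matrix
    Y = A @ omega                        # first pass over A: sample its range
    Q, _ = np.linalg.qr(Y)               # orthonormal basis containing the sampled range
    B = Q.T @ A                          # second pass over A: project onto the basis
    return Q, B

# Synthetic test matrix of exact rank 5 (illustrative data only).
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 100))

Q, B = randomized_low_rank(A, rank=5)
rel_error = np.linalg.norm(A - Q @ B) / np.linalg.norm(A)
```

The two matrix products dominate the cost and are the only steps that touch `A`, which is what makes this style of algorithm attractive when the matrix is too large to hold in fast memory.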

Our main work in randomized algorithms for numerical linear algebra includes:

1) Monte Carlo linear solvers

2) Low-rank approximation

3) Pass-efficient randomized algorithms

4) Matrix Completion using randomized SVD

5) Fast verification of product of large matrices
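To give a flavor of the last item, the sketch below implements the classic Freivalds check, which tests whether A·B = C by multiplying both sides by random 0/1 vectors instead of recomputing the full product. It is shown as a well-known baseline technique, not as the specific method developed in our work; the helper names `matvec` and `freivalds` are chosen for this example.

```python
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def freivalds(A, B, C, trials=20, seed=0):
    """Probabilistically check whether A @ B == C for integer matrices.

    Each trial costs O(n^2) operations. A correct C is always accepted;
    a wrong C is accepted with probability at most 2**-trials.
    """
    rng = random.Random(seed)
    n_cols = len(C[0])
    for _ in range(trials):
        r = [rng.randint(0, 1) for _ in range(n_cols)]
        # Compare A @ (B @ r) with C @ r; no matrix-matrix product is formed.
        if matvec(A, matvec(B, r)) != matvec(C, r):
            return False
    return True

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C_good = [[19, 22], [43, 50]]   # the true product of A and B
C_bad = [[19, 22], [43, 51]]    # one entry is wrong

ok_good = freivalds(A, B, C_good)
ok_bad = freivalds(A, B, C_bad)
```

Exact comparison is appropriate here because the entries are integers; for floating-point matrices the equality test would need a tolerance.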


Systems Biology

We are interested in applying our randomized algorithms to practical problems. One of the main application areas is systems biology, because these randomized algorithms can efficiently sample large datasets and extract global patterns. By treating the association matrix between biological entities as an incomplete matrix, randomized matrix completion algorithms, coupled with other machine learning techniques such as regularization, optimization, feature induction, or deep learning, can be used to derive the underlying unknown associations effectively.
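The idea of completing an association matrix can be sketched as follows. This is a simple iterative low-rank imputation built on a randomized SVD, under the assumption that the true association matrix is approximately low rank; it is not the algorithm from our publications, and the function names, the `n_power` and `oversample` parameters, and the synthetic data are all illustrative.

```python
import numpy as np

def randomized_svd(M, rank, oversample=5, n_power=2, seed=0):
    """Approximate truncated SVD via a Gaussian sketch with power iterations."""
    rng = np.random.default_rng(seed)
    k = min(rank + oversample, min(M.shape))
    Y = M @ rng.standard_normal((M.shape[1], k))
    for _ in range(n_power):
        Y = M @ (M.T @ Y)                 # sharpen the dominant subspace
    Q, _ = np.linalg.qr(Y)
    U_small, s, Vt = np.linalg.svd(Q.T @ M, full_matrices=False)
    return (Q @ U_small)[:, :rank], s[:rank], Vt[:rank]

def complete_matrix(M_obs, mask, rank, n_iter=200):
    """Fill entries where mask is False with a rank-`rank` estimate."""
    X = np.where(mask, M_obs, 0.0)        # start with zeros in unknown entries
    for _ in range(n_iter):
        U, s, Vt = randomized_svd(X, rank)
        low_rank = (U * s) @ Vt
        X = np.where(mask, M_obs, low_rank)  # keep known entries, update the rest
    return X

# Synthetic rank-2 "association" matrix with about 70% of entries observed.
rng = np.random.default_rng(42)
truth = rng.standard_normal((40, 2)) @ rng.standard_normal((2, 30))
mask = rng.random((40, 30)) < 0.7

completed = complete_matrix(truth, mask, rank=2)
hidden = ~mask
err_completed = np.linalg.norm((completed - truth)[hidden])
err_zero_fill = np.linalg.norm(truth[hidden])  # baseline: predict 0 everywhere
```

In a real association problem the hidden entries are the candidate associations to be ranked, and the rank and any regularization would be chosen by cross-validation on held-out known entries.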

This approach has proven successful, not only in improving accuracy but also in revealing biological patterns, in a variety of bioinformatics problems, including drug repositioning, lncRNA-disease association prediction, and miRNA-target association prediction. Here are some of our representative publications.


Protein Structure Modeling

I have an ongoing interest in understanding and modeling protein structures. Major contributions from our group include:

1) The discovery of a new conserved conformational cluster of phosphorylated tyrosine side-chain structures

2) A loop modeling method with sub-angstrom accuracy

3) A series of prediction servers for predicting protein properties with improved accuracy and reliability

4) Parallel algorithms for fast evaluation of inter-residue interactions

5) Protein structure modeling potentials

6) Fragment libraries

This work is funded by the National Science Foundation, Division of Computing and Communication Foundations, under CAREER award CCF-0845702: Novel Sampling Approaches for Protein Modeling Applications.