Infection and cardiovascular disease are two main causes of mortality in the dialysis population. Even though acute infections have been associated with an increased risk of myocardial infarction and stroke in the general population, the extent to which infection contributes to an increased risk of cardiovascular events over time in the dialysis population is largely unknown. The largest source of research data for the dialysis population is the United States Renal Data System (USRDS) database, which contains hospitalization records of nearly all patients on maintenance dialysis. Our long-term goal is to study the dynamic association between cardiovascular events and various contributing risk factors, particularly infection. Towards this goal, we will develop generalized semiparametric regression models to study trends over time, specifically over time (years) on dialysis and over age. Determining the age- and time-dependent association between infection and the occurrence of cardiovascular events, and obtaining the predicted subject-specific risk trajectory (probability) of cardiovascular events based on predictors from, for instance, the previous one to three months (i.e., time-lagged prediction), are critical steps towards the development of targeted intervention strategies in the US dialysis population. Innovation. The main challenge towards this goal is the lack of methods able to handle the challenging structure of the longitudinal data available for analysis, characterized by extreme (ultra-) sparsity, unsynchronized measurements, and imprecision (measurement error). This structure arises because the data are collected from patient hospitalization records, which are extremely irregular and infrequent. In addition, longitudinal clinical inflammatory marker data (available for a subset of the USRDS cohort) are measured at time points unsynchronized with the outcome and are possibly contaminated with measurement error.
Currently, no methods exist for generalized semiparametric regression modeling of a longitudinal binary outcome (e.g., occurrence of cardiovascular events) or of a count/rate outcome that can handle 1) irregular, 2) infrequent, 3) unsynchronized and 4) error-prone longitudinal data. Aims. The proposed research will fill this gap by developing new estimation and inference procedures for generalized semiparametric regression models (GSRMs) for longitudinal data under these emerging challenges using functional data analysis (FDA). This will be achieved through the following specific aims: 1) Develop a unified functional analysis framework for estimation and inference for GSRMs, including generalized varying coefficient models and generalized partial linear varying coefficient models, for highly irregular, infrequent, unsynchronized and noise-contaminated longitudinal data; 2) Develop methods to predict subject-specific response trajectories; 3) Characterize the efficiency of our proposed FDA approach. Furthermore, these methods will be used to determine, for the first time, the longitudinal dynamics of cardiovascular-infection risk in the dialysis population. PUBLIC HEALTH RELEVANCE: The public health burden directly related to infection and cardiovascular disease in the dialysis population is substantial. The proposal involves developing the necessary estimation and inference framework to use the United States Renal Data System database in modeling age- and time-varying dynamics of the association between cardiovascular events and various contributing risk factors, including infection. Understanding these cardiovascular-infection risk dynamics in patients over time is important to the development of targeted intervention strategies in the US dialysis population.