Email: ilyinv@bc.edu
Address: Biology Department, Boston College, 140 Commonwealth Ave.,
Chestnut Hill, MA 02467
Tel. (617) 552-3540, fax (617) 552-2011, valentin.ilyin@bc.edu
1988, MS in Molecular Biology, Moscow Institute of Physics and Technology,
MIPT (Fiztech).
1992, PhD in Physics and Mathematics, Shubnikov Institute of Crystallography,
Moscow, Boris Vainstein lab. Area: Solid State Physics, with specialization in
Protein Crystallography. Study and work on protein theory and on
crystallography theory and experiment.
11/1991 - 2/1995, Scientist, Group Leader, Boris Vainstein lab, Institute of
Crystallography, Moscow, Russia. Work included several scientific projects on
protein structure analysis and development of computational tools for protein
analysis. Work on refinement methods, protein solvent accessibility, and
crystal structures of DD-carboxypeptidase and the extracellular
guanyl-specific RNAase of the fungus Aspergillus pallidus (RNAase ApI).
2/1999 - 5/1999, Programming Researcher, Cold Spring Harbor Laboratory, NY.
Short-term project on development of tools for protein sequence/structure
analysis.
1/1995 - 1/2000, Postdoctoral Fellow, Research Associate, Charles Carter lab,
University of North Carolina at Chapel Hill, NC. Protein crystallography:
participated in development and implementation of Maximum Entropy and Bayesian
statistics methods and their application to refinement of protein structures
(solvent flattening), methods of non-crystallographic symmetry for refinement,
X-ray structure of tryptophanyl-tRNA synthetase and several enzyme/substrate
complexes. Refinement methods for biomolecular crystallography.
2/2000 – 7/2002, Research Associate, Andrej Sali lab, Rockefeller University,
NY. Participated in development of homology modeling software, MODELLER;
database of comparative structure models, MODBASE; pipeline for large-scale
structural genomics (>500,000 protein models), MODPIPE; target selection for
Structural Genomics; front-end analytical and visualization tools and database
interface to multiple sequence and structure resources, ModView; bioinformatics
databases: ligand database, LigBase, and structural alignments database, DBAli.
7/2002 – 6/2009, Assistant Professor of Bioinformatics, Northeastern
University, Boston, MA. Please see the publication list, bioinformatics
software and research project descriptions below.
7/2009 – current, Research Associate Professor, Biology Department, Boston
College, Chestnut Hill, MA
Since 2002 our projects have been presented in over 45 posters at national and
international meetings.
V. Ilyin, “An accurate structural alignment by TOPOFIT”, Broad
Institute, MIT, on Apr 28, 2009
A. Abyzov, A. Uzun, P. Strauss, and V. Ilyin, "AP endonuclease 1 - DNA
polymerase beta: Theoretical prediction of interacting surfaces",
Mar 17, 2008, MIT's Boston DNA Repair and Mutagenesis (DRAM)
Group, MIT, Cambridge, MA
V. Ilyin, "Widespread occurrence of non-sequential relations among
proteins detected by TOPOFIT. Do proteins evolve as we think they
do?" November 20, 2007, University of Chicago & Argonne National Lab, Chicago.
V. Ilyin, Non-sequential alignments in protein structure comparison:
rare exceptions or protein feature? Northeastern University, on Sep 19, 2007
V. Ilyin, “An accurate structural alignment by TOPOFIT”, Boston
University, on Mar 28, 2006.
V. Ilyin, "Friend, an Integrated Front-End Application for
Bioinformatics", Clark University, Worcester, on Apr 27, 2004.
V. Ilyin, Mapping sequence features on protein structure and vice
versa. Second Northeast Bioinformatics Consortium Conference,
October 24-25, 2003, Boston, MA.
V. Ilyin, Applying Bioinformatics for Molecular
Modeling, Oct 20, 2003, Talk at Bouve College,
Northeastern University
V. Ilyin, Bioinformatics Methods, Feb 28, 2003, Computer
Science College, Boston
V. Ilyin, Non-Polar Nuclei in Proteins and Front-End
Applications for Bioinformatics, Virginia Tech University, May 12, 2002
V. Ilyin, Non-Polar Nuclei in Proteins and Front-End
Applications for Bioinformatics, Koln University, Germany, Apr 17, 2002
National Institutes of Health, National Library of Medicine, R01,
Accurate protein structural comparisons by TOPOFIT. 1R01LM009519-01A1
PI: Valentin Ilyin, $714,600, 2008-2011.
The goal of the project is to employ the advantages and new
opportunities provided by the TOPOFIT approach for the systematic analysis of
protein structures in general and, together with many other emerging methods,
to apply them to specific biological problems, in order to develop new insight
into protein stability, functionality, specificity and evolution, and to
facilitate the development of new therapeutics.
ICSS (Institute of Complex Scientific Software), $24,000, Valentin
Ilyin as co-PI (together with Computer and Engineering Dept.).
Integration of metabolic pathway and disease information into our
StructureSNP web server to address specific biomedical problems.
Main public web site: http://ilyinlab.org
Friend is a bioinformatics application designed for simultaneous
analysis and visualization of multiple structures and sequences of proteins
and/or DNA/RNA. The application provides extended functionalities such as:
structure visualization with different rendering and coloring, sequence
alignment, and simple phylogeny analysis, along with a number of advanced
features to perform more complex analyses of sequence-structure relationships,
including: structural alignment of proteins, investigation of specific
interaction motifs, and studies of protein-protein, protein-DNA, and
protein-ligand interactions in protein super-families. Friend is
also useful for the functional annotation of proteins, target identification,
protein modeling, and protein folding studies. Friend provides three levels of
usage: 1) an extensive GUI for a scientist with no programming experience, 2) a
command line interface for scripting for a scientist with some programming
experience, and 3) the ability to extend Friend with user written libraries for
an experienced programmer. The application is linked and communicates with
local and remote sequence and structure databases. Friend is also now available
in Applet form, which empowers users with all the functionality currently found
in Friend, and provides a new web-based presentation platform, with detailed
organization and manipulation of structure/sequence information, at the press
of a button. Friend is a popular bioinformatics application, with > 100
downloads per month since 2005. We are constantly upgrading the Friend
software package with new functionality. http://ilyinlab.org/friend
TOPOFIT-DB (T-DB) is a database of structural relations between
all proteins. Currently T-DB has three specific aims:
1) To help researchers locate and analyze the structure neighbors found
by the TOPOFIT method, including functional amino acid conservation,
structural core analysis, and flexible region analysis.
2) To give researchers, such as crystallographers, a portal through which a
one-to-all comparison of a newly determined structure against the entire PDB
can be carried out.
3) Through its online visualization software (Friend), to provide users with
the ability to quickly analyze the structural alignments stored in T-DB.
http://ilyinlab.org/topofit
*) The new T-DB 2009 with all the up-to-date structural relations has
already been calculated and we are in the process of finalizing and releasing
the update.
SNPs located within the open reading frame of a gene
that result in an alteration in the amino acid sequence of the encoded protein
[nonsynonymous SNPs (nsSNPs)] might directly or indirectly affect functionality
of the protein, alone or in the interactions in a multi-protein complex, by
increasing/decreasing the activity of the metabolic pathway. Understanding the
functional consequences of such changes and drawing conclusions about the
molecular basis of diseases involves integrating information from multiple
heterogeneous sources, including sequence, structure data and pathway relations
between proteins. The data from NCBI's SNP database (dbSNP), gene and protein
databases from Entrez, protein structures from the PDB and pathway information
from KEGG have all been cross-referenced in the StSNP web server, in an effort
to provide combined, integrated reports about nsSNPs. StSNP provides 'on the
fly' comparative modeling of nsSNPs with links to metabolic pathway
information, along with real-time visual comparative analysis of the modeled
structures using the Friend software application. The use of metabolic pathways
in StSNP allows a researcher to examine possible disease-related pathways
associated with particular nsSNP(s), and to link the diseases with the
currently available molecular structure data.
The server is publicly available at http://ilyinlab.org
General: almost 20 years of programming experience, almost 10 years in
leading software development projects including all the public resources
presented on the web site and also:
C/C++: ~200,000 or more lines of code on a variety of projects, 1988 - current
Java: ~100,000 lines, 2000 - current
Fortran: ~1.5 MB of code, 1995 - 2000
Perl/HTML/XML/PHP: ~100,000 lines, 2000 - current
MySQL: design and development of several large DBs (MODBASE, LigBase, DBAli,
TOPOFIT-DB, SEDB, StSNP), available on the web. Development of large-scale
pipelines on a number of multi-node clusters to feed the DBs; hundreds of
millions of calculations.
OS: Linux/Unix, Windows, Mac OS X, networking, multi-core clusters.
Statistical: ROOT, SPSS, Matlab, R.
Computational biology/Bioinformatics software: MODELLER (participated in
development for 2.5 years), Swiss-MODEL.
Macromolecular visualization: RasMol, DeepView (SwissPDBViewer), MolMol,
Chimera, PyMol; developed the analytical-visualization software package Friend
(~20 MB of code in C++/Java/Perl).
Molecular Dynamics: GROMOS, NAMD/VMD, Amber/CHARMM force fields.
Crystallography software: CCP4, Xplor/CNS, O, QUANTA, MICE, etc.
There are three main objectives:
1) Theoretical-computational research: a combination of theoretical
developments in understanding the underlying principles of molecules in
biology, their interactions, assembly and functionality, with large-scale
analyses and generalizations toward the functionality of biological systems.
2) Application of all available Bioinformatics/Biostatistics and
Computational Biology tools to address real biomedical problems in
collaboration with experiment.
3) Tools development, including computer applications, web servers, databases
and other resources.
My research interest is the functionality of organisms at the molecular level,
which includes protein-protein, protein-RNA and DNA-protein interactions along
with ligand-protein interactions, protein structure comparison, protein
classification and functionality, sequence-structure-function relationships,
analysis of genome variations, and integration of sequence, structure,
expression and metabolic information to better understand the molecular basis
of diseases.
I am very open to collaborations with experimental labs and have
successfully accomplished a number of collaborative projects with biologists,
chemists and programmers; please see the list of publications. I have
professional experience in large-scale bioinformatics data analysis, including
DNA/RNA and protein sequences, their variation, protein structures and their
complexes and interactions.
Protein X-ray crystallography: Participation in solution of 8 PDB structures.
Highlight: solution of the ligand-free structure of tryptophanyl-tRNA
synthetase (TrpRS), including original building of the protein model into
electron density and refinement; work on the Maximum Entropy Method for
refinement of protein structures and its application to the solution of TrpRS.
Major contribution to the X-ray solution of several other TrpRS complexes.
Protein-ligand interactions: Development of a database of families of aligned
ligand binding sites in known protein sequences and structures, LigBase; StSNP;
TrpRS complexes; TOPOFIT-THEMATICS structure-based active site prediction;
modeling of enzymes involved in sterol/isoprenoid biosynthesis.
Prediction of protein-protein interfaces and Molecular Dynamics analysis:
Interaction between AP-ENDO and POL-BETA binding with DNA in the DNA-repair
mechanism.
Genome variations: Integration of protein structure, nsSNPs and metabolic
pathways into the public web server StSNP.
Protein alignments: Development of an original objective method for protein
structure comparison, TOPOFIT; application of the TOPOFIT method to comparison
of all available structures in the PDB, and development and maintenance of a
web-based database of structure alignments, TOPOFIT-DB, along with a one-to-all
web-based and email-based T-server. Classification of protein domains based on
the TOPOFIT method.
Bioinformatics toolbox: Integrated Front-End application for multiple
structure visualization and multiple sequence alignment, Friend, with over 200
functions. It is a bioinformatics application designed for simultaneous
analysis and visualization of multiple structures and sequences of proteins
and/or DNA/RNA. The application provides basic functionalities such as
structure visualization with different rendering and coloring, sequence
alignments, and simple phylogeny analysis, along with a number of extended
features to perform more complex analyses of sequence-structure relationships,
including structure alignment of proteins, investigation of specific
interaction motifs, and studies of protein-protein and protein-DNA
interactions and of protein super-families.
Protein structure prediction: Large-scale comparative modeling pipeline; a
database of protein models, MODBASE; modeling of genomic variations leading to
non-synonymous SNPs in the StructureSNP web server; tools for modeling:
real-time alignment with multiple alignments and modeling through Friend
software.