Supplements of ISMB 2005 paper:
Y.Yamanishi, J.-P.Vert, and M.Kanehisa,
Supervised Enzyme Network Inference from the Integration of Genomic Data and Chemical Information
Predicted enzyme network for all the enzyme candidate proteins of the budding yeast
(1120 ORFs of Saccharomyces cerevisiae)
- Correlation coefficient matrix between the proteins in feature space (text file)
- Examples of adjacency matrix between the proteins (text file)
Note that the number of edges is determined by the threshold applied to the estimated graphical associations.
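As a rough illustration of how the threshold controls the number of edges, the sketch below turns a symmetric score matrix into a 0/1 adjacency matrix. The function names, the toy 3 x 3 matrix, and the threshold values are assumptions made for this example; they are not taken from the supplement files, and the rule "score >= threshold" is one plausible convention rather than the authors' stated procedure.

```python
def adjacency_from_scores(scores, threshold):
    """Return a 0/1 adjacency matrix with an edge wherever score >= threshold.

    The diagonal is always set to 0 so that no protein is linked to itself.
    """
    n = len(scores)
    return [
        [1 if i != j and scores[i][j] >= threshold else 0 for j in range(n)]
        for i in range(n)
    ]


def count_edges(adj):
    """Count undirected edges by summing the strict upper triangle."""
    n = len(adj)
    return sum(adj[i][j] for i in range(n) for j in range(i + 1, n))


# Toy symmetric score matrix (hypothetical values, for illustration only).
scores = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.5],
    [0.1, 0.5, 1.0],
]

# A higher threshold keeps fewer edges.
for t in (0.3, 0.6):
    print(t, count_edges(adjacency_from_scores(scores, t)))
```

Lowering the threshold yields a denser predicted network, raising it a sparser one, which is why several example adjacency matrices can be derived from the same score matrix.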
Data for metabolic network prediction
Kernel matrices (668 x 668 matrices)
- Kernel matrix of gold standard metabolic network
- Kernel matrix of gene expression
- Kernel matrix of localization
- Kernel matrix of phylogenetic profiles
- Kernel matrix of chemical compatibility network
- Kernel matrix integrating all the genomic and chemical data
- Kernel matrix integrating genomic data only (without chemical compatibility)
- Kernel matrix by weighted integration of all the genomic and chemical data
- Kernel matrix by weighted integration of genomic data only (without chemical compatibility)
Note that all kernels are normalized to 1 on the diagonal.
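For readers who want to reproduce this kind of normalization, a minimal sketch follows. It uses the standard rescaling K[i][j] / sqrt(K[i][i] * K[j][j]), which gives 1 on the diagonal. The `weighted_sum` helper assumes that integration means an element-wise weighted sum of kernel matrices; that is an assumption for illustration, as are the function names, the toy matrices, and the weights.

```python
import math


def normalize_kernel(K):
    """Rescale a kernel matrix so every diagonal entry equals 1.

    Uses K[i][j] / sqrt(K[i][i] * K[j][j]); assumes a strictly positive diagonal.
    """
    n = len(K)
    d = [math.sqrt(K[i][i]) for i in range(n)]
    return [[K[i][j] / (d[i] * d[j]) for j in range(n)] for i in range(n)]


def weighted_sum(kernels, weights):
    """Element-wise weighted sum of equally sized kernel matrices.

    Assumption: this is one simple way to integrate kernels, not
    necessarily the weighting scheme used for the files listed above.
    """
    n = len(kernels[0])
    return [
        [sum(w * K[i][j] for K, w in zip(kernels, weights)) for j in range(n)]
        for i in range(n)
    ]


# Toy 2 x 2 kernels (hypothetical values, for illustration only).
K_expr = normalize_kernel([[4.0, 2.0], [2.0, 9.0]])
K_phyl = normalize_kernel([[1.0, 0.5], [0.5, 1.0]])

# Equal weights give the unweighted sum; renormalize the result afterwards.
K_all = normalize_kernel(weighted_sum([K_expr, K_phyl], [1.0, 1.0]))
print(K_expr)
print(K_all)
```

Normalizing each kernel before combining them keeps data sources with large raw values from dominating the sum.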
Original datasets
- ORF names of enzyme genes (668 x 1 matrix)
- ORF names and EC number list (668 x 2 matrix)
- Adjacency matrix (opposite Laplacian matrix) of enzyme genes in the KEGG/PATHWAY metabolic network
- Adjacency matrix (opposite Laplacian matrix) of enzyme genes in the chemical compatibility network, based on the first 3 digits of the EC numbers
- Matrix of gene expression (668 x 157 matrix)
- Matrix of localization (668 x 23 matrix)
- Matrix of phylogenetic profiles (668 x 145 matrix)
Chemical datasets, such as reactions and compounds, used for constructing the
chemical compatibility network
(below, "C number" means the compound number in KEGG/LIGAND)
- ORF names and EC number list (1120 x 2 matrix)
- EC number and C number list extracted from the compound data in KEGG/LIGAND
- EC number and C number list (organic compounds only) extracted from the compound data in KEGG/LIGAND
We removed inorganic compounds such as water, oxygen, and phosphate
based on the organic.lst made by Tonomura-san.
Note that "999999" is written when no organic compound is involved in an EC number.
- C number and EC number list extracted from the compound data in KEGG/LIGAND
- Adjacency matrix of EC numbers consisting of the first 3 digits (199 x 199 matrix; 200 x 200 if the label row and column are counted)
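The sketch below illustrates how an EC number / C number list could be collapsed to the first 3 digits of the EC numbers and turned into an adjacency matrix. It is a simplification under stated assumptions: two EC classes are linked when they share at least one organic compound, which may differ from the rule used to build the matrix above. The function names and the toy pairs are hypothetical and are not read from the supplement files. Only the "999999" placeholder comes from the description above.

```python
from collections import defaultdict
from itertools import combinations

NO_ORGANIC = "999999"  # placeholder used when an EC number has no organic compound


def ec_class(ec, digits=3):
    """Keep only the first `digits` fields of an EC number, e.g. 2.7.1.1 -> 2.7.1."""
    return ".".join(ec.split(".")[:digits])


def class_adjacency(pairs):
    """Build labels and a 0/1 adjacency matrix over 3-digit EC classes.

    Assumption: two classes are adjacent if they share an organic compound.
    """
    compounds = defaultdict(set)
    for ec, c_number in pairs:
        group = compounds[ec_class(ec)]  # registers the class even if it stays empty
        if c_number != NO_ORGANIC:
            group.add(c_number)
    labels = sorted(compounds)
    index = {label: k for k, label in enumerate(labels)}
    adj = [[0] * len(labels) for _ in labels]
    for a, b in combinations(labels, 2):
        if compounds[a] & compounds[b]:
            adj[index[a]][index[b]] = adj[index[b]][index[a]] = 1
    return labels, adj


# Toy (EC number, C number) pairs, for illustration only.
pairs = [
    ("2.7.1.1", "C00031"),
    ("2.7.1.1", "C00092"),
    ("5.3.1.9", "C00092"),
    ("5.3.1.9", "C00085"),
    ("1.15.1.1", NO_ORGANIC),
]

labels, adj = class_adjacency(pairs)
print(labels)
for row in adj:
    print(row)
```

Classes whose only entry is "999999" still appear as rows and columns but have no edges, since they contribute no organic compounds.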