Cytochrome P450 Knowledge Base: Development and Application
Dissertation
The concept of replacing previously created databases, which are static information resources, with dynamically evolving knowledge bases stems from the need for prompt, multifaceted processing of accumulating experimental material. The distinguishing feature of a knowledge base is a flexible data structure that can adapt efficiently to the rapidly changing conditions of the task at hand…
References
- Aivazyan S.A., Enyukov I.S., Meshalkin L.D. (1983). Applied Statistics. Fundamentals of Modeling and Primary Data Processing. Moscow, Finansy i Statistika. 1:471. (In Russian.)
- Archakov A.I. (1975). Microsomal Oxidation. Moscow, Nauka. (In Russian.)
- Astakhova T.V., Oleinikova M.A., Roytberg M.A. (2002). Comparative analysis of informational biopolymers. //In: Computers and Supercomputers in Biology (Lakhno V.D., Ustinin M.N., eds.). Moscow-Izhevsk, Institute of Computer Research. 449−457. (In Russian.)
- Blumenfeld L.A. (1977). Problems of Biological Physics. Moscow, Nauka. (In Russian.)
- Borodovsky M.Yu., Pevzner P.A. (1990). Statistical methods for the analysis of genetic texts. //In: Computer Analysis of Genetic Texts. Moscow, Nauka. 36−80. (In Russian.)
- Volkenstein M.V. (1986). Entropy and Information. Moscow, Nauka. (In Russian.)
- Gusev S.A. (2002). Structural and functional motifs in cytochrome P450 sequences. //Candidate of Biological Sciences dissertation. Orekhovich Institute of Biomedical Chemistry, Russian Academy of Medical Sciences, Moscow. (In Russian.)
- Degtyarenko K.N. (1992). Multiple alignment and homology analysis in the P450 superfamily. //Candidate of Biological Sciences dissertation. Institute of Biological and Medical Chemistry, Moscow. (In Russian.)
- Ivanov A.S., Skvortsov V.S., Sechenykh A.A., Dubanov A.V., Lisitsa A.V. (2003). Computer modeling of the three-dimensional structure of cytochrome P450. //Biomeditsinskaya Khimiya. 49:221−37. (In Russian.)
- Lisitsa A.V. (2002). Proteomic index of the cytochrome P450 superfamily. //Candidate of Biological Sciences dissertation. Orekhovich Institute of Biomedical Chemistry, Russian Academy of Medical Sciences, Moscow. (In Russian.)
- Lisitsa A.V., Gusev S.A., Miroshnichenko Yu.V., Kuznetsova G.P., Lazarev V.N., Skvortsov V.S., Karuzina I.I., Govorun V.M., Archakov A.I. (2004). Structural and functional motifs of sterol 14-alpha-demethylases (CYP51). //Biomeditsinskaya Khimiya. 6:555−567. (In Russian.)
- Lisitsa A.V., Miroshnichenko Yu.V., Ivanov A.S., Archakov A.I. (2003). The general and the particular in the structural organization of proteins of the cytochrome P450 superfamily. //Allergiya, Astma i Klinicheskaya Immunologiya. 8:14−19. (In Russian.)
- Miroshnichenko Yu.V. (2006). The general and the particular in the structural organization of proteins of the cytochrome P450 superfamily. //Candidate of Biological Sciences dissertation. Orekhovich Institute of Biomedical Chemistry, Russian Academy of Medical Sciences, Moscow. (In Russian.)
- Rubin A.B. (2004). Biophysics, in 2 volumes. Vol. 1: Theoretical Biophysics: Textbook. Moscow: MSU, Nauka. Chapter 3. (In Russian.)
- Filimonov D.A., Poroikov V.V. (2006). Prediction of the biological activity spectrum of organic compounds. //Rossiiskii Khimicheskii Zhurnal. 2:66−75. (In Russian.)
- Fomenko A.E., Sobolev B.N., Filimonov D.A., Poroikov V.V. (2003). Use of structural MNA descriptors for building profiles of protein families. //Biofizika. 48:595−605. (In Russian.)
- Chernysh M.F. (2000). Experience in applying cluster analysis. //Sotsiologiya: 4M. 12:129−141. (In Russian.)
- Yatskiv I., Gusarova L. (2003). Methods for determining the number of clusters in unsupervised classification. //Transport and Telecommunication. 4:23−28. (In Russian.)
- Abagyan R.A., Batalov S. (1997). Do aligned sequences share the same fold? //J. Mol. Biol. 273:355−368.
- Abecassis V., Urban P., Truan G., Pompon D. (2003). Exploration of natural and artificial sequence spaces: towards a functional remodelling of membrane-bound cytochrome P450s. //Biocatalysis and Biotransformation. 21:55−66.
- Al-Shahrour F., Minguez P., Vaquerizas J.M., Conde L., Dopazo J. (2005). BABELOMICS: a suite of web tools for functional annotation and analysis of groups of genes in high-throughput experiments. //Nucleic Acids Res. 33: W460−4.
- Altschul S.F. (1998). Generalized affine gap costs for protein sequence alignment. //Proteins. 32:88−96.
- Altschul S.F., Bundschuh R., Olsen R., Hwa T. (2001). The estimation of statistical parameters for local alignment score distributions. //Nucleic Acids Res. 29:351−61.
- Altschul S.F., Erickson B.W. (1985). Significance of nucleotide sequence alignments: a method for random sequence permutation that preserves dinucleotide and codon usage. //Mol. Biol. Evol. 2:526−38.
- Altschul S.F., Gish W., Miller W., Myers E.W., Lipman D.J. (1990). Basic local alignment search tool. //J. Mol. Biol. 215:403−10.
- Ananko E.A., Podkolodny N.L., Stepanenko I.L., Podkolodnaya O.A., Rasskazov D.A., Miginsky D.S., Likhoshvai V.A., Ratushny A.V., Podkolodnaya N.N., Kolchanov N.A. (2005). GeneNet in 2005. //Nucleic Acids Res. 33: D425−7.
- Andrade M.A., Valencia A. (1998). Automatic extraction of keywords from scientific text: application to the knowledge domain of protein families. //Bioinformatics. 14:600−607.
- Archakov A.I., Bachmanova G.I. (1990). Cytochrome P450 and Active Oxygen. Taylor & Francis, 339.
- Archakov A.I., Degtyarenko K.N. (1993). Structural classification of the P450 superfamily based on consensus sequence comparison. //Biochem Mol Biol Int. 31:1071−80.
- Archakov A.I., Lisitsa A.V., Zgoda V.G., Ivanova M.S., Koymans L. (1998). Clusterization of P450 superfamily using the objective pair alignment method and the UPGMA program. //J. Mol. Model. 4:234−238.
- Ashburner M., Ball C.A., Blake J.A., Botstein D., Butler H., Cherry J.M., Davis A.P., Dolinski K., Dwight S.S., Eppig J.T. (2000). Gene Ontology: tool for the unification of biology. //Nature Genet. 25:25−29.
- Attwood T.K. (2001). A compendium of specific motifs for diagnosing GPCR subtypes. //Trends Pharmacological Sci. 22:162−165.
- Attwood T.K., Bradley P., Flower D.R., Gaulton A., Maudling N., Mitchell A.L., Moulton G., Nordle A., Paine K., Taylor P., Uddin A., Zygouri C. (2003). PRINTS and its automatic supplement, prePRINTS. //Nucleic Acids Res. 31:400−402.
- Bader G.D., Betel D., Hogue C.W. (2003). BIND: the Biomolecular Interaction Network Database. //Nucleic Acids Res. 31:248−50.
- Bairoch A. (2000). The ENZYME database in 2000. //Nucleic Acids Res. 28:304−305.
- Barney B.M. (2006). Classification of proteins based on minimal modular repeats: lessons from nature in protein design. //J. Proteome Res. 5:473−82.
- Benson D.A., Karsch-Mizrachi I., Lipman D.J., Ostell J., Wheeler D.L. (2006). GenBank. //Nucleic Acids Res. 34:16−20.
- Berger M.P., Munson P.J. (1991). A novel randomized iterative strategy for aligning multiple protein sequences. //Comput Appl Biosci. 7:479−84.
- Berman H.M., Westbrook J., Feng Z., Gilliland G., Bhat T.N., Weissig H., Shindyalov I.N., Bourne P.E. (2000). The Protein Data Bank. //Nucleic Acids Res. 28:235−42.
- Bernhardt R. (2004). Optimized chimeragenesis- creating diverse p450 functions. //Chem Biol. 11:287−8.
- Bhat T.N., Bourne P., Feng Z., Gilliland G., Jain S., Ravichandran V., Schneider B., Schneider K., Thanki N., Weissig H., Westbrook J., Berman H.M. (2001). The PDB data uniformity project. //Nucleic Acids Res. 29:214−218.
- Blaschke C., Andrade M.A., Ouzounis C., Valencia A. (1999). Automatic extraction of biological information from scientific text: Protein-protein interactions. //International Conference on Intelligent Systems for Molecular Biology. 7:60−67.
- Boguski M.S., Lowe T.M., Tolstoshev C.M. (1993). dbEST - database for "expressed sequence tags". //Nature Genet. 4:332−333.
- Borodina Yu., Sadym A., Filimonov D., Blinova V., Dmitriev A., Poroikov V. (2003). Predicting biotransformation potential from molecular structure. //J. Chem. Inf. Comput. Sci. 43:1636−1646.
- Brooks B.R. (1983). CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. //J. of Computational Chemistry. 4:187−217.
- Bucher P., Bairoch A. (1994). A generalized profile syntax for biomolecular sequence motifs and its function in automatic sequence interpretation. //Proc Int Conf Intell Syst Mol Biol. 2:53−61.
- Bucher P., Karplus K., Moeri N., Hofmann K. (1996). A flexible motif search technique based on generalized profiles. //Comput. Chem. 20:3−23.
- Burrage K., Hood L., Ragan M.A. (2006). Advanced computing for systems biology. //Brief Bioinform. 7:390−8.
- Chakrabarti S., Van den Berg M., Dom B. (1998). Focused crawling: A new approach to topic-specific web resource discovery. //Proc. of the WWW-8, May.
- Chefson A., Auclair K. (2006). Progress towards the easier use of P450 enzymes. //Mol Biosyst. 2:462−9.
- Cohen M.B., Feyereisen R. (1995). A cluster of cytochrome P450 genes of the CYP6 family in the house fly. //DNA Cell Biol. 14:73−82.
- Cowie J., Lehnert W. (1996). Information Extraction. //Communications of the ACM. 39:80−91.
- Davies D.L., Bouldin D.W. (1979). A cluster separation measure. //Pattern Anal. Machine Intell. 1:224−227.
- Dayhoff M.O., Barker W.C., Hunt L.T. (1983). Establishing homologies in protein sequences. //Methods Enzymol. 91:524−545.
- Dayhoff M.O., Schwartz R.M., Orcutt B.C. (1978). In Atlas of Protein Sequence and Structure (M.O. Dayhoff, ed.). National Biomedical Research Foundation, Washington, DC. 3:345.
- Degtyarenko K.N., Archakov A.I. (1993). Molecular evolution of P450 superfamily and P450-containing monooxygenase systems. //FEBS Lett. 332:1−8.
- Deken J. (1983). Probabilistic behavior of longest-common-subsequence length. //In Time Warps, String Edits and Macromolecules: The Theory and Practice of Sequence Comparison. Sankoff D., Kruskal J.B. (eds.). Addison-Wesley, Reading MA., 55−91.
- Dembo A., Karlin S., Zeitouni O. (1994). Limit distribution of maximal non-aligned two-sequence segmental score. //Ann. Prob. 22:2022−2039.
- Doolittle R.F. (1986). Of URFs and ORFs: a primer on how to analyze derived amino acid sequences. University Science Books, Mill Valley, California.
- Efron B., Halloran E., Holmes S. (1996). Bootstrap confidence levels for phylogenetic trees. //Proc Natl Acad Sci USA. 93:13429−34.
- Eggertsen G., Olin M., Andersson U., Ishida H., Kubota S., Hellman U., Okuda K.I., Bjorkhem I. (1996). Molecular cloning and expression of rabbit sterol 12alpha-hydroxylase. //J. Biol Chem. 271:32269−75.
- Eisen J.S. (1998). Genetic and molecular analyses of motoneuron development. //Curr Opin Neurobiol. 8:697−704.
- Ekins S., Bravi G., Wikel J.H., Wrighton S.A. (1999). Three-dimensional-quantitative structure activity relationship analysis of cytochrome P-450 3A4 substrates. //J. Pharmacol. Exp. Ther. 291:424−33.
- Ekins S., Wrighton S.A. (2001). Application of in silico approaches to predicting drug-drug interactions. //J. Pharmacol. Toxicol. Methods. 45:65−9.
- Estabrook R.W. (2003). A passion for P450s (remembrances of the early history of research on cytochrome P450). //Drug Metab. Dispos. 31:1461−73.
- Etzold T., Ulyanov A.V., Argos P. (1996). SRS: information retrieval system for molecular biology data banks. //Methods Enzymol. 266:114−128.
- Fabian P., Degtyarenko K.N. (1997). The directory of P450-containing systems in 1996. //Nucleic Acids Research. 25:274−277.
- Felsenstein J. (1988). Phylogenies from molecular sequences: inference and reliability. //Annu. Rev. Genet. 22:521−565.
- Feng D.F., Doolittle R.F. (1987). Progressive sequence alignment as a prerequisite to correct phylogenetic trees. //J. Mol. Evol. 25:351−60.
- Filimonov D., Poroikov V., Borodina Yu., Gloriozova T. (1999). Chemical Similarity Assessment through multilevel neighborhoods of atoms: definition and comparison with the other descriptors. //J. Chem. Inf. Comput. Sci. 39:666−670.
- Fitch W.M. (1970). Distinguishing homologous from analogous proteins. //Systematic Zoology. 19:99−106.
- Fitch W.M. (1983). Random sequences. //J. Mol. Biol. 163:171−6.
- Fitch W.M. (2000). Homology: a personal view on some of the problems. //Trends Genet. 16:227−231.
- Fleischmann W., Moller S., Gateau A., Apweiler R. (1999). A novel method for automatic functional annotation of proteins. //Bioinformatics. 15:228−233.
- FlyBase Consortium (1994). FlyBase: the Drosophila database. //Nucleic Acids Res. 22:3456−3458.
- FlyBase Consortium (2003). The FlyBase database of the Drosophila genome projects and community literature. //Nucleic Acids Res. 31:172−175.
- Fogleman J.C. (2000). Response of Drosophila melanogaster to selection for P450-mediated resistance to isoquinoline alkaloids. //Chem. Biol. Interact. 125:93−105.
- Fomenko A.E., Filimonov D.A., Sobolev B.N., Poroikov V.V. (2006). New approach to predict enzyme function without the alignment. //OMICS: A Journal of Integrative Biology. 10:56−65.
- Fukuda K., Tsunoda T., Tamura A., Takagi T. (1998). Toward information extraction: identifying protein names from biological papers. //Pac. Symp. Biocomput., 707−718.
- Gaizauskas R., Wilks Y. (1998). Information Extraction: Beyond Document Retrieval. //Journal of Documentation. 54:70−105.
- Garcia C.A., Chen Y.P., Ragan M.A. (2005). Information integration in molecular bioscience. //Appl Bioinformatics. 4:157−73.
- Gasteiger E., Gattiker A., Hoogland C., Ivanyi I., Appel R.D., Bairoch A. (2003). ExPASy: The proteomics server for in-depth protein knowledge and analysis. //Nucleic Acids Res. 31:3784−8.
- Gattiker A., Gasteiger E., Bairoch A. (2002). ScanProsite: a reference implementation of a PROSITE scanning tool. //Applied Bioinform. 1:107−108.
- Gell-Mann M. (1994). A child learning the language: Algorithmic complexity and informational content. //The quark and the jaguar: adventures in the simple and the complex. W.H. Freeman and Company: New York, 58−60.
- Gilardi G., Meharenna Y.T., Tsotsou G.E., Sadeghi S.J., Fairhead M., Giannini S. (2002). Molecular Lego: design of molecular assemblies of P450 enzymes for nanobiotechnology. //Biosens Bioelectron. 17:133−45.
- Gonzalez F.J., Gelboin H.V. (1992). Human cytochromes P450: evolution and cDNA-directed expression. //Environ Health Perspect. 98:81−85.
- Goto S., Nishioka T., Kanehisa M. (1998). LIGAND: chemical database for enzyme reactions. //Bioinformatics. 14:591−599.
- Gotoh O. (1990). Consistency of optimal sequence alignments. //Bull Math Biol. 52:509−25.
- Gotoh O. (1992). Substrate recognition sites in cytochrome P450 family 2 (CYP2) proteins inferred from comparative analyses of amino acid and coding nucleotide sequences. //J. Biol. Chem. 267:83−90.
- Gotoh O. (1993). Optimal alignment between groups of sequences and its application to multiple sequence alignment. //Comput Appl Biosci. 9:361−70.
- Gotoh O. (1999). Multiple sequence alignment: algorithms and applications. //Adv. Biophys. 36:159−206.
- Gotoh O. (2000). Homology-based gene structure prediction: simplified matching algorithm using a translated codon (tron) and improved accuracy by allowing for long gaps. //Bioinformatics. 16:190−202.
- Graham S.E., Peterson J.A. (2002). Sequence alignments, variabilities, and vagaries. //Methods Enzymol. 357:15−28.
- Gribskov M., Luthy R., Eisenberg D. (1990). Profile analysis. //Methods Enzymol. 183:146−159.
- Gribskov M., McLachlan A.D., Eisenberg D. (1987). Profile analysis: detection of distantly related proteins. //Proc. Natl. Acad. Sci. USA. 84:4355−4358.
- Guengerich F.P. (1992). Characterization of human cytochrome P450 enzymes. //FASEB J. 6:745−8.
- Guex N., Peitsch M.C. (1997). SWISS-MODEL and the Swiss-PdbViewer: An environment for comparative protein modeling. //Electrophoresis. 18:2714−2723.
- Gumbel E.J. (1958). Statistics of extremes. Columbia University Press, New York, NY.
- Gunsalus I.C., Pederson T.C., Sligar S.G. (1975). Oxygenase-catalyzed biological hydroxylations. //Annu. Rev. Biochem. 44:377−407.
- Halkidi M., Batistakis Y., Vazirgiannis M. (2001). On clustering validation techniques. //Journal of Intelligent Information Systems. 17:107−145.
- Harris T., Lee R., Schwarz E., Bradnam K., Lawson D., Chen W., Blasier D., Kenny E., Cunningham F., Kishore R. (2003). WormBase: a cross-species database for comparative genomics. //Nucleic Acids Res. 31:133−137.
- Hayaishi O. (1974). Molecular Mechanisms of Oxygen Activation. Academic, New York.
- Heinemann M., Panke S. (2006). Synthetic biology-putting engineering into biology. //Bioinformatics. 22:2790−9.
- Henikoff S., Greene E.A., Pietrokovski S., Bork P., Attwood T.K., Hood L. (1997). Gene families: the taxonomy of protein paralogs and chimeras. //Science. 278:609−614.
- Henikoff S., Henikoff J.G. (1992). Amino acid substitution matrices from protein blocks. //Proc. Natl. Acad. Sci. USA. 89:10915−9.
- Henikoff S., Henikoff J.G. (1993). Performance evaluation of amino acid substitution matrices. //Proteins. 17:49−61.
- Hersh W.R., Evans D.A., Monarch I.A., Lefferts R.G., Handerson S.K., Gorman P.N. (1992). Indexing Effectiveness of Linguistic and Non-Linguistic Approaches to Automatic Indexing. Elsevier Science Publishers, Amsterdam.
- Higgins D.G., Bleasby A.J., Fuchs R. (1992). CLUSTAL V: improved software for multiple sequence alignment. //Comput. Appl. Biosci. 8:189−91.
- Higgins D.G., Thompson J.D., Gibson T.J. (1996). Using CLUSTAL for multiple sequence alignments. //Methods Enzymol. 266:383−402.
- Hoogland C., Sanchez J.-C., Tonella L., Binz P.-A., Bairoch A., Hochstrasser D.F., Appel R.D. (2000). The 1999 SWISS-2DPAGE database update. //Nucleic Acids Res. 28:286−288.
- Hubbard T., Barker D., Birney E., Cameron G., Chen Y., Clark L., Cox T., Cuff J., Curwen V., Down T. (2002). The Ensembl genome database project. //Nucleic Acids Res. 30:38−41.
- Huynen M.A., Bork P. (1998). Measuring genome evolution. //Proc. Natl. Acad. Sci. USA. 95:5849−5856.
- Ioannides C., Lewis D.F., Parke D.V. (1993). Computer modelling in predicting carcinogenicity. //Eur. J. Cancer Prev. 2:275−82.
- Jonassen I., Collins J.F., Higgins D.G. (1995). Finding flexible patterns in unaligned protein sequences. //Protein Sci. 4:1587−95.
- Kalinina O.V., Mironov A.A., Gelfand M.S., Rakhmaninova A.B. (2004a). Automated selection of positions determining functional specificity of proteins by comparative analysis of orthologous groups in protein families. //Protein Sci. 13:443−456.
- Kalita M.K., Ramasamy G., Duraisamy S., Chauhan V.S., Gupta D. (2006). ProtRepeatsDB: a database of amino acid repeats in genomes. //BMC Bioinformatics. 7:336.
- Kanehisa M. (1997). A database for post-genome analysis. //Trends Genet. 13:375−376.
- Kanehisa M., Goto S. (2000). KEGG: kyoto encyclopedia of genes and genomes. //Nucleic Acids Res. 28:27−30.
- Kanehisa M., Goto S., Hattori M., Aoki-Kinoshita K.F., Itoh M., Kawashima S., Katayama T., Araki M., Hirakawa M. (2006). From genomics to chemical genomics: new developments in KEGG. //Nucleic Acids Res. 34: D354−7.
- Kanehisa M., Goto S., Kawashima S., Okuno Y., Hattori M. (2004). The KEGG resource for deciphering the genome. //Nucleic Acids Res. 32: D277-D280.
- Kans J.A., Ouellette B.F.F. (2001). Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins. Baxevanis A., Ouellette B.F.F., editors. New York, NY: John Wiley and Sons, Inc., 65−81.
- Karlin S., Altschul S.F. (1990). Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. //Proc. Natl. Acad. Sci. USA. 87:2264−2268.
- Karlin S., Brocchieri L., Bergman A., Mrazek J., Gentles A.J. (2002). Amino acid runs in eukaryotic proteomes and disease associations. //Proc. Natl. Acad. Sci. USA. 99:333−8.
- Kimura M. (1983). The Neutral Theory of Molecular Evolution. Cambridge University Press, Cambridge, UK.
- Kimura M. (1991). The neutral theory of molecular evolution: a review of recent evidence. //Jpn J. Genet. 66:367−86.
- Kocsor A., Kertesz-Farkas A., Kajan L., Pongor S. (2006). Application of compression-based distance measures to protein sequence classification: a methodological study. //Bioinformatics. 22:407−12.
- Kolchanov N.A., Ananko E.A., Kolpakov F.A., Podkolodnaya O.A., Ignateva E.V., Goryachkovskaya T.N., Stepanenko I.L. (2000). Gene networks. //Mol. Biol. (Moscow). 34:449−460.
- Krull M., Voss N., Choi C., Pistor S., Potapov A., Wingender E. (2003). TRANSPATH: an integrated database on signal transduction and a tool for array analysis. //Nucleic Acids Res. 31:97−100.
- Kuralenok I., Dobrynin V., Nekrestyanov I., Bessonov M., Patel A. (1999). Distributed search in topic-oriented document collections. //Proc. of World Multiconference on Systemics, Cybernetics and Informatics (SCI'99). 4:377−383.
- Lamb D.C., Fowler K., Kieser T., Manning N., Podust L.M., Waterman M.R., Kelly D.E., Kelly S.L. (2002). The cytochrome P450 complement (CYPome) of Streptomyces coelicolor A3(2). //J. Biol. Chem. 277:24000−5.
- Laskin A.A., Kudryashov N.A., Skryabin K.G., Korotkov E.V. (2005). Latent periodicity of serine-threonine and tyrosine protein kinases and other protein families. //Comput. Biol. Chem. 29:229−43.
- Lau A.Y., Chasman D.I. (2004). Functional classification of proteins and protein variants. //Proc. Natl. Acad. Sci. USA. 101:6576−81.
- Leinonen R., Nardone F., Zhu W., Apweiler R. (2006). UniSave: the UniProtKB sequence/annotation version database. //Bioinformatics. 22:1284−1285.
- Lempel A., Ziv J. (1976). On the complexity of finite sequences. //IEEE Transactions on Information Theory. 22:75−81.
- Lesk A.M. (1988). In Lesk AM (ed.) Computational Molecular Biology. Oxford University Press, Oxford, 17−26.
- Levan G., Jacob H.J. (2006). Nomenclature of the rat Cyp genes and the problems of gene nomenclature in general. //Hum. Genomics. 2:343−4.
- Lewi P.J., Moereels H., Adriaensen D. (1992). The combination of dendrograms with plots of latent variables. An application to G-protein coupled receptor sequences. //Chem. Intell. 16:145−54.
- Lewis D., Jones K. (1996). Natural language processing for information retrieval. //Commun. ACM. 39:92−101.
- Lewis D.F. (2003). Quantitative structure-activity relationships (QSARs) within the cytochrome P450 system: QSARs describing substrate binding, inhibition and induction of P450s. //Inflammopharmacology. 11:43−73.
- Lewis D.F., Sheridan G. (2001). Cytochromes P450, oxygen, and evolution. //Scientific World Journal. 1:151−67.
- Lipman D.J., Wilbur W.J. (1984). Interaction of silent and replacement changes in eukaryotic coding sequences. //J. Mol. Evol. 21:161−167.
- Lisitsa A., Archakov A., Lewi P., Janssen P. (2003). Bioinformatic insight into the unity and diversity of cytochromes P450. //Methods and Findings in Experimental and Clinical Pharmacology. 25:733−745.
- Lisitsa A.V., Gusev S.A., Karuzina I.I., Archakov A.I., Koymans L. (2001). Cytochrome P450 Database. //SAR QSAR Environ Res. 12:359−366.
- Liu X., Liu D., Qi J., Zheng W.M. (2002). Simplified amino acid alphabets based on deviation of conditional probability from random background. //Phys Rev E Stat Nonlin Soft Matter Phys. 66:021906.
- Lo Conte L., Ailey B., Hubbard T.J., Brenner S.E., Murzin A.G., Chothia C. (2000). SCOP: a structural classification of proteins database. //Nucleic Acids Res. 28:257−9.
- Lopez P., Casane D., Philippe H. (2002). Heterotachy, an important process of protein evolution. //Mol. Biol. Evol. 19:1−7.
- MacKerell A.D. Jr. (1998). All-atom empirical potential for molecular modeling and dynamics studies of proteins. //Journal of Physical Chemistry. 102:3586−3616.
- Mann H.J. (2006). Drug-associated disease: cytochrome P450 interactions. //Crit. Care. Clin. 22:329−45.
- Mao B., Gozalbes R., Barbosa F., Migeon J., Merrick S., Kamm K., Wong E., Costales C., Shi W., Wu C., Froloff N. (2006). QSAR modeling of in vitro inhibition of cytochrome P450 3A4. //J. Chem. Inf. Model. 46:2125−34.
- Marcotte E.M., Xenarios I., Eisenberg D. (2001). Mining literature for protein-protein interactions. //Bioinformatics. 17:359−63.
- Matsunaga I., Yamada A., Lee D.S. (2002). Enzymatic reaction of hydrogen peroxide-dependent peroxygenase cytochrome P450s: kinetic deuterium isotope effects and analyses by resonance Raman spectroscopy. //Biochemistry. 41:1886−1892.
- McGinnis S., Madden T. (2004). BLAST: at the core of a powerful and diverse set of sequence analysis tools. //Nucleic Acids Res. 32: W20-W25.
- McKusick V.A. (1998). Mendelian Inheritance in Man. Catalogs of Human Genes and Genetic Disorders. 12th edn. Baltimore, MD: The Johns Hopkins University Press.
- Meyer M.M., Hochrein L., Arnold F.H. (2006). Structure-guided SCHEMA recombination of distantly related beta-lactamases. //Protein Eng. Des. Sel. 19:563−70.
- Mirny L.A., Gelfand M.S. (2002). Using orthologous and paralogous proteins to identify specificity-determining residues in bacterial transcription factors. //J. Mol. Biol. 321:7−20.
- Moereels H., Lewi P.J., Koymans L.M., Janssen P.A. (1997). The alpha and omega of G protein-coupled receptors. A novel method for classification. //Ann. NY Acad. Sci. 812:147−8.
- Morgenstern B. (1999). DIALIGN 2: improvement of the segment-to-segment approach to multiple sequence alignment. //Bioinformatics. 15:211−218.
- Mosteller F., Wallace D.L. (1984). Applied Bayesian and Classical Inference: the Case of the Federalist Papers. Springer, New York.
- Mulder N., Apweiler R., Attwood T., Bairoch A., Barrell D., Bateman A., Binns D., Biswas M., Bradley P., Bork P. (2003). The InterPro Database, 2003 brings increased coverage and new features. //Nucleic Acids Res. 31:315−318.
- Muppirala U.K., Li Z. (2006). A simple approach for protein structure discrimination based on the network pattern of conserved hydrophobic residues. //Protein Eng. Des. Sel. 19:265−75.
- Murakami K., Mihara K., Omura T. (1994). The transmembrane region of microsomal cytochrome P450 identified as the endoplasmic reticulum retention signal. //J. Biochem (Tokyo). 116:164−75.
- Nagarajan N., Jones N., Keich U. (2005). Computing the P-value of the information content from an alignment of multiple sequences. //Bioinformatics. 21(Suppl 1): i311−8.
- Nebert D.W., Adesnik M., Coon M.J. (1987). The P450 gene superfamily: recommended nomenclature. //DNA. 6:1−11.
- Nebert D.W., Jaiswal A.K., Meyer U.A., Gonzalez F.J. (1987). Human P-450 genes: evolution, regulation and possible role in carcinogenesis. //Biochem. Soc. Trans. 15:586−9.
- Nebert D.W., Nelson D.R. (1991). P450 gene nomenclature based on evolution. //Methods Enzymol. 206:3−11.
- Nebert D.W., Nelson D.R., Feyereisen R. (1989). Evolution of the cytochrome P450 genes. //Xenobiotica. 19:1149−1160.
- Needleman S.B., Wunsch C.D. (1970). A general method applicable to the search for similarities in the amino acid sequence of two proteins. //J. Mol. Biol. 48:443−53.
- Nekrasov A.N. (2004). Analysis of the information structure of protein sequences: a new method for analyzing the domain organization of proteins. //J. Biomol. Struct. Dyn. 21:615−24.
- Nekrestyanov I., O’Meara T., Romanova E. (1999). Building topic-specific collections with intelligent agents. //Proc. of Sixth International Conference on Intelligence in Services and Networks (IS&N'99), Barcelona, Spain, April.
- Nelson D.R. (1999). Cytochrome P450 and the individuality of species. //Arch. Biochem. Biophys. 369:1−10.
- Nelson D.R. (2004). Frankenstein genes, or the Mad Magazine version of the human pseudogenome. //Hum. Genomics. 1:310−316.
- Nelson D.R. (2005). Gene nomenclature by default, or BLASTing to Babel. //Hum Genomics. 2:196−201.
- Nelson D.R., Strobel H.W. (1987). Evolution of cytochrome P-450 proteins. //Mol. Biol. Evol. 4:572−593.
- Nelson D.R. (2006). Cytochrome P450 nomenclature, 2004. //Methods Mol. Biol. 320:1−10.
- Nenadic G., Mima H., Spasic I., Ananiadou S., Tsujii J. (2002). Terminology-driven literature mining and knowledge acquisition in biomedicine. //Int. J. Med. Inform. 67:33−48.
- Ng S., Wong M. (1999). Toward Routine Automatic Pathway Discovery from Online Scientific Text Abstracts. //Genome Inform Ser Workshop Genome Inform. 10:104−112.
- Pang H., Tang J., Chen S.S. (2005). Statistical distributions of optimal global alignment scores of random protein sequences. //BMC Bioinformatics. 6:257.
- Papka R., Allan J. (1998). Document classification using multiword features. //Proc. of the CIKM'98. 124−131.
- Peitsch M.C. (1995). Protein modelling by E-Mail. //Biotechnology. 13:658−660.
- Peitsch M.C. (1997). Large scale protein modelling and model repository. //Proc. Int. Conf. Intell. Syst. Mol. Biol. 5:234−236.
- Podust L.M., Poulos T.L., Waterman M.R. (2001). Crystal structure of cytochrome P450 14alpha-sterol demethylase (CYP51) from Mycobacterium tuberculosis in complex with azole inhibitors. //Proc. Natl. Acad. Sci. USA. 98:3068−73.
- Proux D., Rechenmann F., Julliard L., Pillet V., Jacq B. (1998). Detecting Gene Symbols and Names in Biological Texts: A First Step toward Pertinent Information Extraction. //Genome Inform Ser Workshop Genome Inform. 9:72−80.
- Pruitt K., Maglott D. (2001). RefSeq and LocusLink: NCBI gene-centered resources. //Nucleic Acids Res. 29:137−140.
- Pruitt K., Tatusova T., Maglott D. (2005). Entrez Gene. //Nucleic Acids Res. 33: D54-D58.
- Pruitt K.D., Tatusova T., Maglott D.R. (2005). NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. //Nucleic Acids Res. 33: D501-D504.
- Raevsky O.A. (1997). Hydrogen Bond Strength Estimation by means of HYBOT. //In "Computer-Assisted Lead Finding and Optimization", eds. H. Waterbeemd, B. Testa, G. Folkers, Basel: Verlag, 367−378.
- Reich J.G., Drabsch H., Daumler A. (1984). On the statistical assessment of similarities in DNA sequences. //Nucl. Acids Res. 12:5529−5543.
- Rendic S., Di Carlo F. (1997). Human Cytochrome P450 Enzymes: A status report summarizing their reactions, substrates, inducers and inhibitors. //Drug Metabolism Reviews. 29:413−580.
- Rindflesch T.C., Tanabe L., Weinstein J.N., Hunter L. (2000). EDGAR: extraction of drugs, genes and relations from the biomedical literature. //Pac. Symp. Biocomput. 5:517−528.
- Rodriguez-Tome P., Stoehr P.J., Cameron G.N., Flores T.P. (1996). The European Bioinformatics Institute (EBI) databases. //Nucleic Acids Res. 24:6−12.
- Saitou N., Nei M. (1987). The neighbor joining method: a new method for reconstructing phylogenetic trees. //Mol. Biol. Evol. 4:406−425.
- Sali A., Potterton L., Yuan F., van Vlijmen H., Karplus M. (1995). Evaluation of comparative protein modeling by MODELLER. //Proteins. 23:318−26.
- Salton G. (1989). Automatic Text Processing: the transformation, analysis and retrieval of information by computer. Reading, Mass. Addison Wesley.
- Sawyer S. (1989). Statistical tests for detecting gene conversion. //Mol. Biol. Evol. 6:526−38.
- Schuler G.D. (1997). Pieces of the puzzle: expressed sequence tags and the catalog of human genes. //J. Mol. Med. 75:694−698.
- Schuler G.D., Epstein J.A., Ohkawa H., Kans J.A. (1996). Entrez: molecular biology database and retrieval system. //Methods Enzymol. 266:141−162.
- Schwab W. (2003). Metabolome diversity: too few genes, too many metabolites? //Phytochemistry. 62:837−49.
- Scordis P., Flower D.R., Attwood T.K. (1999). FingerPRINTScan: intelligent searching of the PRINTS motif database. //Bioinformatics. 15:799−806.
- Scott E.E., He Y.Q., Halpert J.R. (2002). Substrate routes to the buried active site may vary among cytochromes P450: mutagenesis of the F-G region in P450 2B1. //Chem. Res. Toxicol. 15:1407−13.
- Searls D.B. (2001). Reading the book of life. //Bioinformatics. 17:579−80.
- Seifert A., Tatzel S., Schmid R.D., Pleiss J. (2006). Multiple molecular dynamics simulations of human p450 monooxygenase CYP2C9: the molecular basis of substrate binding and regioselectivity toward warfarin. //Proteins. 64:147−55.
- Sellers P.H. (1974). On the theory and computation of evolutionary distances. //SIAM J. Appl. Math. 26:787−793.
- Sellers P.H. (1984). Pattern recognition in genetic sequences by mismatch density. //Bull. Math. Biol. 46:501−514.
- Shakhnovich E.I., Gutin A.V. (1990). Implication of Thermodynamics of Protein Folding for Evolution of Primary Sequences. //Nature. 346:773−775.
- Sherman B. (1950). A random variable related to the spacing of sample values. //Ann. Math. Stat. 21:339−361.
- Sherman B. (1957). Percentiles of the w (n) statistic. //Ann. Math. Stat. 28:259−261.
- Sherry S.T., Ward M.H., Kholodov M., Baker J., Phan L., Smigielski E., Sirotkin K. (2001). dbSNP: The NCBI database of genetic variation. //Nucleic Acids Res. 29:308−311.
- Shi J., Blundell T.L., Mizuguchi K. (2001). FUGUE: sequence-structure homology recognition using environment-specific substitution tables and structure-dependent gap penalties. //J. Mol. Biol. 310:243−57.
- Shrager J. (2003). The fiction of function. //Bioinformatics. 19:1934−6.
- Sigrist C.J.A., Cerutti L., Hulo N., Gattiker A., Falquet L., Pagni M., Bairoch A., Bucher P. (2002). PROSITE: a documented database using patterns and profiles as motif descriptors. //Briefings Bioinform. 3:265−274.
- Simpson A.E. (1997). The cytochrome P450 4 (CYP4) family. //Gen. Pharmacol. 28:351−9.
- Smith S.L., Bollenbacher W.E., Cooper D.Y., Schleyer H., Wielgus J.J., Gilbert L.I. (1979). Ecdysone 20-monooxygenase: characterization of an insect cytochrome p-450 dependent steroid hydroxylase. //Mol. Cell. Endocrinol. 15:111−113.
- Smith T.F., Waterman M.S. (1981). Identification of common molecular subsequences. //J. Mol. Biol. 147:195−7.
- Smith T.F., Waterman M.S., Burks C. (1985). The statistical distribution of nucleic acid similarities. //Nucleic Acids Res. 13:645−656.
- Sneath P.H.A. (1995). The distribution of the random division of a molecular sequence. //Binary. 7:148−152.
- Sneath P.H.A. (1998). The effect of evenly spaced constant sites on the distribution of the random division of a molecular sequence. //Bioinformatics. 14:608−616.
- Sneath P.H.A., Sokal R.R. (1973). Numerical Taxonomy. San Francisco: W.H. Freeman.
- Solovyev V.V., Makarova K.S. (1993). A novel method of protein sequence classification based on oligopeptide frequency analysis and its application to search for functional sites and to domain localization. //Comput. Appl. Biosci. 9:17−24.
- Sonnhammer E.L., Eddy S.R., Birney E., Bateman A., Durbin R. (1998). Pfam: multiple sequence alignments and HMM-profiles of protein domains. //Nucleic Acids Res. 26:320−322.
- Sonnhammer E.L., Koonin E.V. (2002). Orthology, paralogy and proposed classification for paralog subtypes. //Trends Genet. 18:619−620.
- Stata R., Bharat K., Maghoul F. (2000). The term vector database: fast access to indexing terms for web pages. //Proc. of the WWW-9, May.
- Stoesser G., Baker W., van den Broek A., Garcia-Pastor M., Kanz C., Kulikova T., Leinonen R., Lin Q., Lombard V., Lopez R. (2003). The EMBL Nucleotide Sequence Database: major new development. //Nucleic Acids Res. 30:21−26.
- Sugiura A., Etzioni O. (2000). Query routing for web search engines: Architecture and experiments. //Proc. of the WWW-9, May.
- Susko E., Field C., Blouin C., Roger A.J. (2003). Estimation of rates-across-sites distributions in phylogenetic substitution models. //Syst. Biol. 52:594−603.
- Tatusov R.L., Fedorova N.D., Jackson J.D., Jacobs A.R., Kiryutin B., Koonin E.V., Krylov D.M., Mazumder R., Mekhedov S.L., Nikolskaya A.N., Rao B.S. (2003). The COG database: an updated version includes eukaryotes. //BMC Bioinformatics. 4:41.
- Tatusov R.L., Koonin E.V., Lipman D.J. (1997). A genomic perspective on protein families. //Science. 278:631−637.
- Tatusova T., Karsch-Mizrachi I., Ostell J. (1999). Complete genomes in WWW Entrez: data representation and analysis. //Bioinformatics. 15:536−543.
- Tatusova T.A., Madden T.L. (1999). BLAST 2 Sequences, a new tool for comparing protein and nucleotide sequences. //FEMS Microbiol. Lett. 174:247−50.
- Taylor W.R. (1990). Hierarchical method to align large numbers of biological sequences. //Methods Enzymol. 183:456−474.
- The International HapMap Consortium (2003). The International HapMap Project. //Nature. 426:789−796.
- The UniProt Consortium. (2007). The Universal Protein Resource (UniProt). //Nucleic Acids Res. 35:D193−7.
- Thomas J., Milward D., Ouzounis C., Pulman S., Carroll M. (2000). Automatic extraction of protein interactions from scientific abstracts. //Pac. Symp. Biocomput. 5:541−542.
- Thompson J.D., Higgins D.G., Gibson T.J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. //Nucleic Acids Res. 22:4673−4680.
- Turutina V.P., Laskin A.A., Kudryashov N.A., Skryabin K.G., Korotkov E.V. (2006). Identification of amino acid latent periodicity within 94 protein families. //J. Comput. Biol. 13:946−64.
- Unno M., Shimada H., Toba Y., Makino R., Ishimura Y. (1996). Role of Arg112 of cytochrome p450cam in the electron transfer from reduced putidaredoxin. Analyses with site-directed mutants. //J. Biol. Chem. 271:17869−74.
- Vinga S., Almeida J. (2003). Alignment-free sequence comparison-a review. //Bioinformatics. 19:513−23.
- Wang Q., Halpert J.R. (2002). Combined three-dimensional quantitative structure-activity relationship analysis of cytochrome P450 2B6 substrates and protein homology modeling. //Drug Metab. Dispos. 30:86−95.
- Waterman M.S. (1994). Parametric and ensemble sequence alignment algorithms. //Bull. Math. Biol. 56:743−67.
- Waterman M.S., Vingron M. (1994). Rapid and accurate estimates of statistical significance for sequence data base searches. //Proc. Natl. Acad. Sci. USA. 91:4625−4628.
- Webber C., Barton G.J. (2001). Estimation of P-values for global alignments of protein sequences. //Bioinformatics. 17:1158−67.
- Werck-Reichhart D., Feyereisen R. (2000). Cytochromes P450: a success story. //Genome Biol. 1:REVIEWS3003.
- Westbrook J., Feng Z., Chen L., Yang H., Berman H. (2003). The Protein Data Bank and structural genomics. //Nucleic Acids Res. 31:489−491.
- Wieser D., Kretschmann E., Apweiler R. (2004). Filtering erroneous protein annotation. //Bioinformatics. 20:342−347.
- Wilson C.A., Kreychman J., Gerstein M. (2000). Assessing annotation transfer for genomics: quantifying the relations between protein sequence, structure and function through traditional and probabilistic scores. //J. Mol. Biol. 297:233−249.
- Wingender E. (1988). Compilation of transcription regulating proteins. //Nucleic Acids Res. 16:1879−1902.
- Word J.M., Lovell S.C., Richardson J.S., Richardson D.C. (1999). Asparagine and glutamine: using hydrogen atom contacts in the choice of side-chain amide orientation. //J. Mol. Biol. 285:1733−1747.
- Yamada S., Gotoh O., Yamana H. (2006). Improvement in accuracy of multiple sequence alignment using novel group-to-group sequence alignment algorithm with piecewise linear gap cost. //BMC Bioinformatics. 7:524.
- Ye J., McGinnis S., Madden T.L. (2006). BLAST: improvements for better sequence analysis. //Nucleic Acids Res. 34:W6−9.
- Zhao J., Goble C., Stevens R. (2004). Semantically linking and browsing provenance logs for e-science. //In First International Conference on Semantics of a Networked World. 157−174.
- Zharkikh A., Li W.H. (1995). Estimation of confidence in phylogeny: the complete-and-partial bootstrap technique. //Mol. Phylogenet. Evol. 4:44−63.
- Zharkikh A.A., Rzhetsky A.Yu. (1993). Quick assessment of similarity of two sequences by comparison of their L-tuple frequencies. //Biosystems. 30:93−111.

1. ACKNOWLEDGMENTS