Methods, Algorithms, and Software for Discourse Analysis in Building Multilingual Thematic Glossaries
Dissertation
A method has been developed for assembling a thematic collection of semantically homogeneous texts (anthologies) in three languages simultaneously. Keywords, terms, concepts, and phrases are matched by interpretation when documents are formatted, taking into account the particular features of each language, in contrast to the conventional approach of assembling an anthology by keywords using a search engine and word-for-word translation…