A Technique for Online Compression of XML Documents Based on Decomposition of the Hierarchical Data Model
Dissertation
The third chapter establishes that hierarchical dependencies among nested elements in the XML structure affect the accuracy of symbol prediction by the PPM algorithm. It describes the online XML compression technique developed in this work. A preprocessing method is proposed that partially removes redundancy; it is based on decomposing the hierarchical XML data model, which makes it possible to prepare the input data stream…
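The excerpt does not give the details of the decomposition, so the following is only a rough sketch of the general idea of separating XML structure from content before compression. It is not the dissertation's method. The function names `decompose` and `compress_streams`, the stream layout, and the sample document are all hypothetical, and `bz2` from the Python standard library stands in for a PPM coder, which the standard library does not provide.

```python
import bz2
import xml.etree.ElementTree as ET
from collections import defaultdict


def decompose(xml_text):
    """Split a document into a structure stream and per-path text containers.

    Illustrative assumption, not the author's algorithm: values that share
    the same element path are grouped together so that a statistical coder
    sees more homogeneous contexts.
    """
    root = ET.fromstring(xml_text)
    structure = []                  # open/attribute/close markers in document order
    containers = defaultdict(list)  # element path -> values in document order

    def walk(node, path):
        path = f"{path}/{node.tag}"
        structure.append(f"<{node.tag}")
        for name, value in node.attrib.items():
            structure.append(f"@{name}")
            containers[f"{path}/@{name}"].append(value)
        text = (node.text or "").strip()
        if text:
            containers[path].append(text)
        for child in node:
            walk(child, path)
        structure.append(">")       # closing marker for this element

    walk(root, "")
    return structure, dict(containers)


def compress_streams(structure, containers):
    """Compress every stream independently (bz2 is a stand-in for PPM)."""
    streams = {"#structure": "\n".join(structure)}
    for path, values in containers.items():
        streams[path] = "\n".join(values)
    return {name: bz2.compress(data.encode("utf-8"))
            for name, data in streams.items()}


# Hypothetical sample input, loosely modelled on a web-server log.
doc = (
    "<log>"
    "<entry id='1'><host>a.example</host><code>200</code></entry>"
    "<entry id='2'><host>b.example</host><code>404</code></entry>"
    "</log>"
)
structure, containers = decompose(doc)
packed = compress_streams(structure, containers)
```

The sketch is one-way and deliberately simplified: it ignores tail text, mixed content, and values containing newlines, so it cannot restore the original document. Its only purpose is to show how values that share a path (for example all `host` values) end up in one stream, which is the kind of regrouping that can make symbol prediction easier for a context-modelling coder.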
Table fragment (column headers not preserved in the source):
- TPC-H: XML reports (samples) from databases; low; 34
- Weblog: web server event-log data; high; 30
- SwissProt: DNA code sequences; low; 21