Title: XLARGE: an efficient XML Compressor for simple, large and repetitive document
Year of Publication: 2015
Publisher: International Journal of Computer Systems (IJCS)
ISSN: 2394-1065
Series: Volume 2, Number 11
Authors: Debashish Roy

Citation:

Debashish Roy, “XLARGE: an efficient XML Compressor for simple, large and repetitive document”, International Journal of Computer Systems (IJCS), 2(11), pp: 464-468, November 2015.

BibTeX:

@article{key:article,
	author = {Debashish Roy},
	title = {XLARGE: an efficient XML Compressor for simple, large and repetitive document},
	journal = {International Journal of Computer Systems (IJCS)},
	year = {2015},
	volume = {2},
	number = {11},
	pages = {464-468},
	month = {November}
}

Abstract

In the last few years, XML has become the standard for data interchange in most applications on the web. Most enterprise-level applications and middleware software have adopted XML as their standard medium for data interchange, so in these days of Internet and web technology the popularity of XML is huge. The power of XML lies in its self-describing ability. This same ability makes XML verbose, introducing a significant amount of redundancy that adds no particular value. The increased size of XML documents places a burden on applications and on network bandwidth, so there is a clear requirement for a good and effective data compression algorithm for XML. Traditional compression utilities such as GZIP compress XML data into a non-readable format, which means the data cannot be queried until it is decompressed. In this paper I have developed an efficient XML compression utility named XLARGE. XLARGE shows a significant improvement in compression ratio over GZIP.
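
The GZIP baseline mentioned in the abstract can be illustrated with a minimal Python sketch. It does not reproduce XLARGE or the paper's measurements. The sample document, its element names (`records`, `record`, `id`, `name`, `status`), and the helper functions are hypothetical, and the ratio is taken here as compressed size divided by original size, which may differ from the definition used in the paper.

```python
import gzip
import xml.etree.ElementTree as ET


def build_sample_xml(n_records: int) -> bytes:
    """Build a simple, large, repetitive XML document (hypothetical schema)."""
    root = ET.Element("records")
    for i in range(n_records):
        rec = ET.SubElement(root, "record")
        ET.SubElement(rec, "id").text = str(i)
        ET.SubElement(rec, "name").text = f"item-{i}"
        ET.SubElement(rec, "status").text = "active"
    return ET.tostring(root, encoding="utf-8")


def compression_ratio(original: bytes, compressed: bytes) -> float:
    """Assumed definition: compressed size / original size (smaller is better)."""
    return len(compressed) / len(original)


xml_bytes = build_sample_xml(1000)
gz_bytes = gzip.compress(xml_bytes)
ratio = compression_ratio(xml_bytes, gz_bytes)
print(f"original={len(xml_bytes)} B, gzip={len(gz_bytes)} B, ratio={ratio:.3f}")

# The repeated tags account for most of the document and compress well,
# but the gzip stream no longer exposes the markup as plain text.
markup_visible = b"<record>" in gz_bytes

# To query the data, the whole document has to be decompressed first.
restored = gzip.decompress(gz_bytes)
names = [elem.text for elem in ET.fromstring(restored).iter("name")]
print(f"markup visible in gzip output: {markup_visible}, names recovered: {len(names)}")
```

The sketch only shows the two properties the abstract attributes to GZIP: repetitive markup compresses well, and the compressed form has to be fully decompressed before any element can be read.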

Keywords

XML, Data Compression, Web Technology.