Calgary Corpus


The Calgary Corpus is a set of text and binary files used to test and compare data compression algorithms.

It was created by Ian Witten and Tim Bell in the 1980s and was commonly used in the 1990s. In 1997 it was replaced by the Canterbury Corpus, but the Calgary Corpus is still available and still useful. Its main advantage is that a new algorithm can easily be compared with the algorithms whose results on this corpus are already known.

Despite its popularity, it is somewhat outdated: its files are small, and some of them are in formats that are no longer used.
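
To make the idea of comparing results concrete, here is a minimal sketch of one common way to report a compressor's performance: compressed size in bits per input byte. The helper name `bits_per_byte` and the `sample` data are hypothetical and only for illustration; a real measurement would read the bytes of each corpus file instead of the stand-in text used here.

```python
import bz2
import lzma
import zlib


def bits_per_byte(data: bytes, compress) -> float:
    """Return the compressed size in bits per byte of input."""
    if not data:
        raise ValueError("cannot measure an empty input")
    return 8 * len(compress(data)) / len(data)


# Stand-in input (assumption): in practice this would be the contents
# of one corpus file, read in binary mode.
sample = b"the quick brown fox jumps over the lazy dog\n" * 200

# Compare three standard-library compressors on the same input.
results = {
    "zlib": bits_per_byte(sample, zlib.compress),
    "bz2": bits_per_byte(sample, bz2.compress),
    "lzma": bits_per_byte(sample, lzma.compress),
}

for name, bpb in results.items():
    print(f"{name}: {bpb:.3f} bits per byte")
```

Lower values mean better compression, and a value of 8 means no compression at all. Because the figure is normalized by input size, numbers measured on the same file can be compared across algorithms regardless of who ran them.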

