Lesson Plan


October 12, 1999

Details

Title

Information Resources on the Web
Compressing files with WinZip
File Types

Time Allotment

3 hrs

Reading / References

Read chapter 5 from Internet Comprehensive
Additional references/information (from book source) can be found at 
    http://www.course.com/downloads/NewPerspectives/internet/cmp/t05.html

References on TallTech WWW

    Common Internet File Formats (HTM) - fairly complete list of formats out there

Documents on TallTech FTP 

Please note: Some documents are a little dated, but conceptually the same.

Objectives

Coursework / Lab Work

Home Work

Demo

Information Resources

WINZIP

Supplemental Information

Text vs. Binary

With text files, most text editors can be used to access the file.

With binary files, a program designed to read that particular binary format must be used.

Binary files can take the form of compressed files, executable files, sound files, graphic files, word processing documents, spreadsheets, etc.
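As a rough illustration of the difference, the Python sketch below guesses whether a block of bytes is text. The function name looks_like_text and the rule it applies (no NUL bytes, and the data decodes as UTF-8) are assumptions made for this example; it is a heuristic, not a definitive test.

```python
def looks_like_text(data: bytes) -> bool:
    """Heuristic guess: text has no NUL bytes and decodes cleanly as UTF-8."""
    if b"\x00" in data:
        return False  # NUL bytes are rare in text but common in binary formats
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(looks_like_text(b"Hello, world\n"))       # a plain text line
print(looks_like_text(b"PK\x03\x04\x00\x00"))   # the start of a ZIP file
```

A file that fails this check is a candidate for a format-specific program rather than a text editor.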

File Compression

Large files, which may be graphics, audio, video or text information, often take several minutes to transfer or download over the Internet. To make more efficient use of space and to speed things up, most large files are compressed. This can often cut download or transfer time by as much as half.

How does compression work? Compression software scans a file for repeating patterns in the data and replaces them with shorter codes that take up fewer bytes. For example, one approach replaces repeated text characters with a code and notes the code's positions in the document.
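One of the simplest ways to replace repeating data with a shorter code is run-length encoding, sketched below in Python. This is only an illustration of the idea; the function names are made up for this example, and programs such as WinZip use more elaborate methods.

```python
from itertools import groupby


def rle_encode(text: str) -> list[tuple[str, int]]:
    """Replace each run of a repeated character with a (character, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Rebuild the original text from the (character, count) pairs."""
    return "".join(ch * count for ch, count in pairs)


encoded = rle_encode("AAAABBBCCD")
print(encoded)              # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_decode(encoded))  # AAAABBBCCD
```

Run-length encoding only helps when the data contains long runs; on text with few repeats, the pairs can take more space than the original.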

Compression is used to reduce the amount of space required to store information, and to reduce the amount of time required to download data. There are two types of compression: "lossy" and "lossless." Lossy compression usually achieves very high data reduction, but it discards some of the data, so the quality lost cannot be recovered when the receiving computer de-compresses the file. Lossless compression uses mathematical rules (called "algorithms") to re-encode the data so that it can be de-compressed exactly.
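The difference between the two types can be sketched in Python. The lossless half uses the standard zlib module; the lossy half uses a made-up quantize helper that rounds sample values down, which is only a stand-in for what real lossy formats do.

```python
import zlib

# Lossless: the decompressed bytes are identical to the original.
original = b"the rain in Spain stays mainly in the plain " * 20
packed = zlib.compress(original)
restored = zlib.decompress(packed)
print(len(original), len(packed), restored == original)


def quantize(samples: list[int], step: int = 16) -> list[int]:
    """Lossy: round each 0-255 sample down to a multiple of `step`."""
    return [s - s % step for s in samples]


# Lossy: fewer distinct values to store, but the exact originals are gone.
print(quantize([3, 18, 200, 255]))  # [0, 16, 192, 240]
```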

There are a number of common compression types used for different data types:

JPEG

The Joint Photographic Experts Group standard is used to compress still images. It exploits known limitations of the human eye, in particular that color details are less important than contrast (light-dark), so those details can be compressed to a greater degree. JPEG is best for photos with subtle shading.

GIF

Graphics Interchange Format, created by CompuServe to store images, reduces color bit depth to 8 bits (256 colors). GIF is best for line art or simple "cartoon" images.

Archives

These programs take several input files (even entire data directories), compress them, and produce a single archive file. Examples are ARC, ARJ, LHA, ZIP, ZOO.
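To show several files going into a single archive, here is a minimal Python sketch using the standard zipfile module. The helper names make_archive and list_archive and the sample file names are invented for this example; the archive is built in memory rather than on disk.

```python
import io
import zipfile


def make_archive(files: dict[str, bytes]) -> bytes:
    """Compress several named files into one ZIP archive held in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


def list_archive(blob: bytes) -> list[str]:
    """Return the names of the files stored in the archive."""
    with zipfile.ZipFile(io.BytesIO(blob)) as archive:
        return archive.namelist()


blob = make_archive({"notes.txt": b"hello " * 100, "docs/readme.txt": b"world"})
print(list_archive(blob))  # ['notes.txt', 'docs/readme.txt']
```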

LZW

Based on the work of pioneers Lempel and Ziv, as refined by Welch; Unisys held a patent on the algorithm (see http://www.cis.ohio-state.edu/hypertext/faq/usenet/compression-faq/ for details on the patents). A variation of LZW is used in the V.42bis modem standard for data transmission.
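The core idea of LZW is to build a table of byte sequences already seen and to emit a table code in place of each repeat. The Python sketch below is a textbook version with an unbounded table and plain integer codes; it is not the bit-packed variant used by GIF, 'compress', or V.42bis.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Emit one code per longest already-seen sequence, growing the table as we go."""
    table = {bytes([i]): i for i in range(256)}  # start with all single bytes
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate            # keep extending the match
        else:
            codes.append(table[current])   # emit code for the longest match
            table[candidate] = next_code   # remember the new sequence
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the same table while reading codes, recovering the original bytes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            entry = previous + previous[:1]  # code refers to the entry being built
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)


message = b"TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_compress(message)
print(len(message), len(codes), lzw_decompress(codes) == message)
```

The decompressor never receives the table; it reconstructs it from the codes themselves, which is why only the codes need to be transmitted.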

MPEG

The Moving Picture Experts Group standards for video not only compress each frame, but also predict frame-to-frame images by ignoring redundant, non-changing pixels and storing only the variations between frames.
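The idea of storing only the variations between frames can be sketched in Python as follows. This is a deliberately simplified, hypothetical model in which a frame is a flat list of pixel values; real MPEG encoders work on blocks of pixels with motion estimation and are far more involved.

```python
def frame_delta(previous: list[int], current: list[int]) -> dict[int, int]:
    """Record only the pixels that changed, as {pixel_index: new_value}."""
    return {
        i: new
        for i, (old, new) in enumerate(zip(previous, current))
        if old != new
    }


def apply_delta(previous: list[int], delta: dict[int, int]) -> list[int]:
    """Rebuild the next frame from the previous frame plus the stored changes."""
    frame = list(previous)
    for i, value in delta.items():
        frame[i] = value
    return frame


frame1 = [10, 10, 10, 10, 50, 50]
frame2 = [10, 10, 10, 10, 50, 90]   # only the last pixel changed
delta = frame_delta(frame1, frame2)
print(delta)                                  # {5: 90}
print(apply_delta(frame1, delta) == frame2)   # True
```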

Archives

ZIP Files - PKZIP / WinZip
Compressed Files (.Z) - Unix compressed file
TAR Files - Unix program that takes separate files and makes one file
LZH Files - LHA archive

There are three main tools used for packaging files on Unix systems: zip, compress, and tar. Note that tar by itself only bundles files together; it does not compress them.

Take the example 'filename.tar.Z': you need to undo the layers from right to left, starting with the last extension. The '.Z' indicates compression via 'compress', so you'd enter 'uncompress filename.tar.Z' to undo it. You should be left with 'filename.tar'. The '.tar' indicates the use of 'tar' (Unix-ese for Tape ARchive). To undo this, enter 'tar -xf filename.tar'. 'tar' is typically used to bundle a series of files, so undoing '.tar' will likely result in the creation of a bunch of files.
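The same two-layer unpacking can be sketched in Python. The standard library has no reader for the '.Z' format, so this example substitutes gzip for 'compress' (an assumption made purely for illustration); the helper names pack and unpack are also invented. The point is the order: the compression layer added last is removed first, then the tar bundle is opened.

```python
import gzip
import io
import tarfile


def pack(files: dict[str, bytes]) -> bytes:
    """Bundle the files with tar, then compress the bundle (gzip stands in for 'compress')."""
    tar_buffer = io.BytesIO()
    with tarfile.open(fileobj=tar_buffer, mode="w") as tar:
        for name, content in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return gzip.compress(tar_buffer.getvalue())


def unpack(blob: bytes) -> dict[str, bytes]:
    """Undo the compression layer first, then read the files out of the tar bundle."""
    tar_bytes = gzip.decompress(blob)                 # like 'uncompress'
    result = {}
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tar:  # like 'tar -xf'
        for member in tar.getmembers():
            if member.isfile():
                result[member.name] = tar.extractfile(member).read()
    return result


files = {"a.txt": b"first file", "b.txt": b"second file"}
print(unpack(pack(files)) == files)  # True
```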

Zip compression is indicated by a '.zip' extension and is undone with 'unzip'. A closely related, in fact compatible, command is 'pkzip', a shareware program from PKWARE found frequently on IBM and compatible platforms.