Text Files versus Binary Files
Text files have two characteristics that binary files don't possess:
Text files are divided into lines. Each line in a text file
normally ends with one or two special characters; the choice of characters
depends on the operating system. In Windows, the end-of-line marker is a
carriage-return character ('\x0d')
followed immediately by a line-feed character ('\x0a'). In UNIX and newer
versions of the Macintosh operating system (Mac OS), the end-of-line marker is
a single line-feed character. Older versions of Mac OS use a single
carriage-return character.
Text files may contain a special "end-of-file"
marker. Some operating systems allow a special byte to be used as a marker at
the end of a text file. In Windows, the marker is '\x1a' (Ctrl-Z).
There's no requirement that Ctrl-Z be present, but if it is, it marks the end
of the file; any bytes after Ctrl-Z
are to be ignored. The Ctrl-Z convention is a holdover from DOS, which in turn
inherited it from CP/M, an early operating system for personal computers. Most
other operating systems, including UNIX, have no special end-of-file character.
Binary files aren't divided into lines. In a binary file,
there are no end-of-line or end-of-file markers; all bytes are treated equally.
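
To make the markers described above concrete, here is a minimal sketch that scans raw bytes (as they would arrive from a stream opened in binary mode) and tallies each end-of-line convention, along with whether a Ctrl-Z byte appears. The names text_markers and scan_markers are hypothetical helpers invented for this illustration; they are not part of the C library.

```c
#include <assert.h>
#include <stddef.h>

/* Tallies of what a raw byte scan found (hypothetical helper type). */
struct text_markers {
    size_t crlf;        /* "\r\n" pairs (Windows)                 */
    size_t lf;          /* lone '\n' (UNIX, newer Mac OS)         */
    size_t cr;          /* lone '\r' (older Mac OS)               */
    int    has_ctrl_z;  /* nonzero if a '\x1a' byte was seen      */
};

/* Scan n raw bytes and count each kind of end-of-line marker. */
struct text_markers scan_markers(const unsigned char *buf, size_t n)
{
    struct text_markers m = {0, 0, 0, 0};
    assert(buf != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (buf[i] == '\x0d') {
            if (i + 1 < n && buf[i + 1] == '\x0a') {
                m.crlf++;
                i++;            /* skip the line feed of the pair */
            } else {
                m.cr++;
            }
        } else if (buf[i] == '\x0a') {
            m.lf++;
        } else if (buf[i] == '\x1a') {
            m.has_ctrl_z = 1;
        }
    }
    return m;
}
```

A scan like this only reports which bytes are present. Whether a '\x1a' byte really marks the end of the data depends on whether the file is meant to be text, which the bytes alone can't tell us.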
When we write data to a file, we'll need to consider whether to store it in text
form or in binary form. To see the difference, consider how we might store the
number 32767 in a file. One option would be to write the number in text form as
the characters 3, 2, 7, 6, and 7. If the character set is ASCII, we'd have the
following five bytes:
    3          2          7          6          7
00110011   00110010   00110111   00110110   00110111
The other option is to store the number in binary, which would take as few as
two bytes:
01111111   11111111
(The bytes will be reversed on systems that store data in little-endian order.)
As this example shows, storing numbers in binary can often save quite a bit of
space.
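
The two representations can be compared in a short sketch. The functions encode_text and encode_binary are hypothetical names chosen for this illustration. The binary version writes the most significant byte first so that its output doesn't depend on the host's byte order; writing the object directly with fwrite would instead use whatever order the machine uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Format value as decimal text into out (capacity cap bytes).
   Returns the number of characters, not counting the null terminator. */
size_t encode_text(uint16_t value, char *out, size_t cap)
{
    int len = snprintf(out, cap, "%u", (unsigned) value);
    assert(len > 0);
    return (size_t) len;
}

/* Store value in two bytes, most significant byte first (big-endian).
   Returns the number of bytes used. */
size_t encode_binary(uint16_t value, unsigned char out[2])
{
    out[0] = (unsigned char) (value >> 8);
    out[1] = (unsigned char) (value & 0xff);
    return 2;
}
```

For 32767 (0x7fff), encode_text needs five bytes, one per digit, while encode_binary needs two: 0x7f followed by 0xff.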
When we're writing a program that reads from a file or writes to a file, we need to
take into account whether it's a text file or a binary file. A program that displays
the contents of a file on the screen will probably assume it's a text file. A
file copying program, on the other hand, can't assume that the file to be
copied is a text file. If it does, binary files containing an end-of-file
character won't be copied completely. When we can't say for sure whether a file
is text or binary, it's safer to assume that it's binary.
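
The file-copying case might be sketched as follows. The function name copy_file and the 4096-byte buffer size are assumptions made for this example. The essential point is the "rb" and "wb" mode strings, which ask fopen for binary streams so that no end-of-line translation or Ctrl-Z handling takes place.

```c
#include <assert.h>
#include <stdio.h>

/* Copy src to dst byte for byte. Both streams are opened in binary
   mode, so every byte is treated equally.
   Returns 0 on success, -1 on failure. */
int copy_file(const char *src, const char *dst)
{
    assert(src != NULL && dst != NULL);

    FILE *in = fopen(src, "rb");
    if (in == NULL)
        return -1;

    FILE *out = fopen(dst, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    unsigned char buf[4096];
    size_t n;
    int status = 0;

    /* Read a block at a time and write back exactly what was read. */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            status = -1;
            break;
        }
    }
    if (ferror(in))
        status = -1;

    fclose(in);
    if (fclose(out) != 0)   /* a failed close can mean lost data */
        status = -1;
    return status;
}
```

On systems that make no distinction between text and binary streams, the b in the mode string has no effect, so using it costs nothing and keeps the program portable.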