Dynamically
Allocated Strings
Dynamic
storage allocation is often useful for working with strings. Strings are stored
in character arrays, and it can be hard to anticipate how long these arrays
need to be. By allocating strings dynamically, we can postpone the decision
until the program is running.
Using malloc to Allocate Memory for
a String
The malloc function has the
following prototype:
void *malloc(size_t size);
malloc
allocates a block of size bytes and returns a pointer to it. Note that size has
type size_t, an unsigned integer type defined in the C library. Unless we're
allocating a very large block of memory, we can just think of size as an
ordinary integer.
Using
malloc to allocate memory for a string is easy, because C guarantees that a
char value requires exactly one byte of storage (sizeof (char) is 1, in other
words). To allocate space for a string of n characters, we'd write
p = malloc(n + 1);
Using
Dynamic Storage Allocation in String Functions
Dynamic
storage allocation makes it possible to write functions that return a pointer
to a '"new" string a string that didn't exist before the function was
called. Consider the problem of writing a function that concatenates two
strings without changing either one. C's standard library doesn't include such
a function (strcat isn't quite what we want, since it modifies one of the
strings passed to it), but we can easily write our own.
Our function will
measure the lengths of the two strings to be concatenated, then call malloc to
allocate just the right amount of space for the result. The function next
copies the first string into the new space and then calls strcat to concatenate
the second string.
char *concat(const char *sl, const
char *s2)
{
char
*result;
result =
malloc(strlen(s1) + strlen(s2) + 1);
if (result
== NULL) {
printf("Error:
malloc failed in concat\n");
exit(EXIT_FAILURE);
}
strcpy(result, s1); strcat(result,
s2); return result;
}
If malloc returns a null pointer, concat prints an error message and
terminates the program. That's not always the right action to take: some
programs need to recover from memory allocation failures and continue running.
Here's how the concat function might be called:
p = concat("abc",
"def");
After the
call, p will point to the string "abcdef". which is stored in a
dynamically allocated array. The array is seven characters long, including the
null character at the end.
Functions such as concat that dynamically
allocate storage must be used with care. When the string that concat returns is
no longer needed, we'll want to call the free function to release the space
that the string occupies. If we don't, the program may eventually run out of
memory.
Dynamically Allocated Arrays
Dynamically
allocated arrays have the same advantages as dynamically allocated strings (not
surprisingly, since strings are arrays). When we're writing a
program, it's often difficult to estimate the proper size for an array; it
would be more convenient to wait until the program is run to decide how large
the array should be. C solves this problem by allowing a program to allocate
space for an array during execution, then access the array through a pointer to
its first element. The close relationship between arrays and pointers. makes a
dynamically allocated array just as easy to use as an ordinary array.
Although malloc can allocate space
for an array, the calloc function is sometimes used instead, since it
initializes the memory that it allocates. The realloc function allows us to
make an array "grow" or "shrink" as needed.
Using
malloc to Allocate Storage for an Array
We can
use malloc to allocate space for an array in much the same way we used it to
allocate space for a string. The primary difference is that the elements of an
arbitrary array won't necessarily be one byte long, as they are in a string. As
a result, we'll need to use the sizeof operator to calculate the amount of
space required for each element.
Suppose we're writing
a program that needs an array of n integers, where n is to be computed during
the execution of the program. We'll first declare a pointer variable:
int *a;
Once the
value of n is known, we'll have the program call malloc to allocate space for
the array:
a = malloc(n * sizeof(int));
Always
use sizeof when calculating how much space is needed for an array. Failing to
allocate enough memory can have severe consequences. Consider the following
attempt to allocate space for an array of n integers:
a = malloc(n * 2);
If int
values are larger than two bytes (as they are on most computers), malloc won't
allocate a large enough block of memory. When we later try to access elements
of the array, the program may crash or behave erratically.
Once it
points to a dynamically allocated block of memory, we can ignore the fact that
a is a pointer and use it instead as an array name, thanks to the relationship between arrays and pointers in C. For example, we could
use the following loop to initialize the array that a points to:
for (i = 0; i < n; i++)
a[i] = 0;
We also have the option of using
pointer arithmetic instead of subscriptine to access the elements of the array.
The calloc Function
Although
the malloc function can be used to allocate memory for an array. C provides an
alternative—the calloc function—that's sometimes better, calloc has the
following prototype in cstdlib. h>:
void *calloc(size_t nmemb, size_t
size);
calloc
allocates space for an array with nmemb elements, each of which is size bytes
long; it returns a null pointer if the requested space isn't available. After
allocating the memory, calloc initializes it by setting all bits to 0. For
example, the following call of calloc allocates space for an array of n
integers, which are all guaranteed to be zero initially:
a = calloc(n, sizeof(int));
Since
calloc clears the memory that it allocates but malloc doesn't, we may
occasionally want to use calloc to allocate space for an object other than an
array. By calling calloc with 1 as its first argument, we can allocate space
for a data item of any type:
struct point { int x, y; } *p;
p = calloc(1, sizeof(struct point));
After
this statement has been executed, p will point to a structure whose x and y
members have been set to zero.
The realloc Function
Once
we've allocated memory for an array, we may later find that it's too large or
too small. The realloc function can resize the array to better suit our needs.
The following prototype for realloc appears in :
void *realloc(void *ptr, size_t
size);
When realloc is called, ptr must point to a memory block
obtained by a previous call of malloc, calloc, or realloc. The size parameter
represents the new size of the block, which may be larger or smaller than the
original size. Although realloc doesn't require that ptr point to memory that's
being used as an array.
The
C standard spells out a number of rules concerning the behavior of realloc:
When it expands a memory block,
realloc doesn't initialize the bytes that are added to the block.
If realloc can't enlarge the memory
block as requested, it returns a null pointer; the data in the old memory block
is unchanged.
If realloc is called with a null
pointer as its first argument, it behaves like malloc.
If realloc is called with 0 as its
second argument, it frees the memory block.
The
C standard stops short of specifying exactly how realloc works. Still, we
expect it to be reasonably efficient. When asked to reduce the size of a memory
block, realloc should shrink the block "in place," without moving the
data stored in the block. By the same token, realloc should always attempt to
expand a memory block without moving it. If it's unable to enlarge the block
(because the bytes following the block are already in use for some other
purpose), realloc will allocate a new block elsewhere, then copy the contents
of the old block into the new one.
Deallocating
Storage
malloc
and the other memory allocation functions obtain memory blocks from a storage
pool known as the heap. Calling these functions too often—or
asking them for large blocks of memory—can exhaust the heap, causing the
functions to return a null pointer.
To make matters worse, a program may allocate blocks of
memory and then lose track of them, thereby wasting space. A block of memory
that's no longer accessible to a program is said to be garbage. A
program that leaves garbage behind has a memory leak. Some languages
provide a garbage collector that automatically locates and recycles
garbage, but C doesn't. Instead, each C program is responsible for recycling
its own garbage by calling die free function to release unneeded memory.
The free Function
The free function has the following
prototype in :
void free(void*ptr);
Using free is easy; we simply pass it a pointer to a memory
block that we no longer need:
p = malloc (...);
q = malloc (...) ;
free(p);
P = q;
Calling
free releases the block of memory that p points to. This block is now available
for reuse in subsequent calls of malloc or other memory allocation functions.
The
argument to free must be a pointer that was previously returned by a memory
allocation function. (The argument may also be a null pointer, in which case
the call of free has no effect.) Passing free a pointer to any other object
(such as a variable or array element) causes undefined behavior.
The "Dangling Pointer"
Problem
Although
the free function allows us to reclaim memory that's no longer needed, using it
leads to a new problem: dangling pointers. The call free (p) deallocates
the memory block that p points to, but doesn't change p itself. If we forget
that p no longer points to a valid memory block, chaos may ensue:
char *p = malloc(4);
free(p);
strcpy(p,
"abc"); /*** WRONG ***/
Modifying
the memory that p points to is a serious error, since our program no longer has
control of that memory.
Attempting
to access or modify a deallocated memory block causes undefined behavior.
Trying to modify a deallocated memory block is likely to have disastrous
consequences that may include a program crash.
Dangling pointers can be hard to spot, since several
pointers may point to the same block of memory. When the block is freed, all
the pointers are left dangling.
Linked Lists
Dynamic
storage allocation is especially useful for building lists, trees, graphs, and
other linked data structures. A linked list consists of a chain of
structures (called nodes), with each node containing a pointer to the
next node in the chain.The last node in the list contains a null pointer, shown
here as a diagonal line.In previous chapters, we've used an array whenever
we've needed to store a collection of data items; linked lists give us an alternative.
A linked list is more flexible than an array: we can easily insert and delete
nodes in a linked list, allowing the list to grow and shrink as needed. On the
other hand, we lose the "random access" capability of an array. Any
element of an array can be accessed in the same amount of time: accessing a
node in a linked list is fast if the node is close to the beginning of the
list, slow if it's near the end.
Declaring a Node Type
To
set up a linked list, the First thing we'll need is a structure that represents
a single node in the list. For simplicity, let's assume that a node contains
nothing but an integer (the node's data) plus a pointer to the next node in the
list. Here's what our node structure will look like:
struct node {
int value; /*
data stored in the node */
struct
node *next; /* pointer to the next node */
} ;
Notice that the next member has type struct node *, which
means that it can store a pointer to a node structure. There's nothing special
about the name node.by the way; its just an ordinary structure tag. Now that we have the node structure
declared, we'll need a way to keep track of where the list begins. In other
words, we'll need a variable that always points to the first node in the list.
Let's name the variable first:
struct node *first = NULL;
Setting first to NULL indicates that
the list is initially empty.
Creating a Node
As
we construct a linked list, we'll want to create nodes one by one. adding each
to the list. Creating a node requires three steps:
Allocate memory for the node.
Store data in the node.
Insert the node into the list.
We'll
concentrate on the first two steps for now.
When
we create a node, we'll need a variable that can point to the node temporarily.
until it's been inserted into the list. Let's call this variable new_node:
struct node *new_node;
We'll use malloc to allocate memory for the new node, saving the return
value in new_node:
new_node = malloc(sizeof(struct node));
new_node
now points to a block of memory just large enough to hold a node structure.
Be careful
to give sizeof the name of the type to be allocated, not the name of a pointer
to that type:
new_node = malloc(sizeof(new_node));
/** WRONG ***/
The
program will still compile, but malloc will allocate only enough memory for a
pointer to a node structure.
The à Operator
Before we
go on to the next step, inserting a new node into a list, let's take a moment
to discuss a useful shortcut. Accessing a member of a structure using a pointer
is so common that C provides a special operator just for this purpose. This
operator, known as right arrow selection, is a minus sign
followed by >. Using the à
operator, we can write
new_nodeàvalue = 10;
(*new_node).value = 10;
The à
operator is a combination ot the * and . operators; it performs indirection on
new_node to locale the structure that it points to, then selects the value
member of the structure.
The à operator produces an lvalue, so we can use it wherever an
ordinary variable would be allowed. We've just seen an example in which
new_nodeà value appears on the left side of
an assignment. It could just as easily appear in a call of scanf:
scanf("%d", &new nodeàvalue);
Notice
lhat the & operator is still required, even though new_node is a pointer.
Without the we'd be passing scanf the value of new_nodeàvalue, which has type int.
Inserting a Node at the Beginning of
a Linked List
One
of the advantages of a linked list is that nodes can be added at any point in
the list: at the beginning, at the end, or anywhere in the middle. The
beginning of a list is the easiest place to insert a node, however, so let's
focus on that case.
If
new_node is pointing to the node to be inserted, and first is pointing to the
first node in the linked list, then we'll need two statements to insert the
node into the list. First, we'll modify the new node's next member to point to
the node that was previously at the beginning of the list:
new_nodeànext = first;
No comments:
Post a Comment