documentation: MAP layer tutoriallibrary: avlmap

Programming with the MAP layer of avlmap-0.9.15

The MAP API layer of the avlmap-0.9.15 library provides an isolation from the details of the underlying AVL binary tree implementation. It provides a limited data abstract based on the common data types used in C programs. Closer access to the binary tree implementation for special purposes is available through the AVL API layer.

A mapping is a relationship between one value, the key, and another value, the data. This allows storing and retrieving of data using the key as the means to identify which data. This kind of facility is also known in other systems as an associative array, or a hash (typically because those implementations use a different algorithm called hashing). The avlmap-0.9.15 library is intended for C programming, but may also be used in C++. Experience with programming in C is assumed in this tutorial.


Getting started

It is assumed at this point that the avlmap-0.9.15 library and headers are properly installed on your system.

Including the header

In order to use the MAP layer of avlmap in a C program, the header file must first be included to provide to the C compiler the macros, symbols and declarations used in avlmap. This is simply done with the following code:

#include <map.h>
at the beginning of each C source file that makes use of any parts of the MAP layer of avlmap. Now the C program is able to define MAP handles and correctly make calls to MAP layer functions.

Choosing a key type

Each mapping has only one key type which you must choose when coding the creation of the mapping. If your data is complex and has a mix of keys of different types, you will need to create separate mappings for each different type of data, and select the mapping to work with based on however your program figures out what type of key it has. The most common mapping key that programmers will end up using is the string of characters, or more simply a string. String are most common because most computer data is data that humans work with, and that data is organized as strings of characters in the various human languages we use every day. avlmap functions properly not only for strings using the ASCII character code set, but also for strings using the UTF-8 encoded Unicode character code set. Since avlmap also supports strings of larger integers than just the C character type, it can also be used to work with 16 bit and 32 bit characters that are stored as strings of the larger integers used to keep those codes.

Creating a mapping

Before a mapping can be used it must first be created. A variable must also be defined to act as the handle for the mapping. This variable is actually implemented as a pointer, but you should never use that handle in any way other than by passing it to the avlmap functions, The handle may be defined like:

MAP mymap;
At an appropriate place in your program, before any use of the mapping takes place, create the mapping via a call to one of the new functions that has the appropriate key type. We will use the unsigned character string key type and create a mapping like so:
mymap = map_str_new(NULL);

if (!mymap) {
fprintf(stderr,"The mapping was not created.\n");
exit(1);
}
The value NULL in the argument to the function call means we are using the standard collation method.

At the end of use of a mapping, it should be destroyed in order to free up memory resources. This may be omitted if your operating system or language environment provides for total memory cleanup for you. This is done without regard to what the original key type was as follows:

map_destroy(mymap);

Inserting data

There are two steps involved in inserting data into a mapping. The first step is to make an empty member associated with a particular key. The second step is to assign a data value to the member.

The empty member is inserted into the mapping using the key we want to use to retrieve the data. Since the mapping was created for keys of ucs, the appropriate insert function must be used. The reason for using different functions for different types is because argument type checking is preserved. We now insert one member of the mapping:

map_str_insert(mymap,"texas");
At this point we now have a member which can be accessed using the key value of "texas". There is no data assigned to that member. The mapping now remembers this new member as the active member.

Now we assign data to the active member of the mapping. Since the mapping remembers which member is active, we do not make references to its key.

map_str_store(mymap,"austin");
In this case, the ucs part of the function name refers to the data type, not the key type. It is possible to store data of different types in the same mapping. A mapping does not have to use the same type for key and data, although in practice you are likely to find that to be the most common occurence.

We can now insert some more data into the mapping:

map_str_insert(mymap,"illinois");
map_str_store(mymap,"springfield");
map_str_insert(mymap,"ohio");
map_str_store(mymap,"columbus");

Retrieving data

In a way similar to inserting data, retrieving data also involves two steps. The first step is to find the member of the mapping with the appropriate key. The second step is to fetch the data from that member.

A member in a mapping is found using a key value that is compared against the keys of the members to find the member that matches. The avlmap library has the ability to find members that are either exactly matching, or are the nearest to an exact match with no exact match is available. The matching behaviour is specified during the find function call. The find function returns a negative value if the member is not found. It returns zero if the member is found but the member has no data. It returns a positive value identifying the data type that was stored if there is data there.

n = map_str_find_exact(mymap,"illinois");
if (n<0) {
fprintf(stderr,"The member was not found.\n");
exit(1);
}
if (n==0) {
fprintf(stderr,"The member had no data.\n");
exit(1);
}
Once the find function is successful (and it is even if the data is unassigned) then it will remember this member as the active member.

Now the data value may be retrieved from the active member. Strings are more complicated in C programming than simple data like integers. The avlmap library handles some of the different ways to handle strings with different functions. In this simple demonstration, a duplicate data string will be allocated.

data = map_str_fetch_dup(mymap,NULL);
printf("The data value is %s\n",data);
free(data);
It is also possible to retrieve the key of the active member:
data = map_str_fetch_dup(mymap,NULL);
key = map_str_key_dup(mymap,NULL);
printf("The capitol city of %s is %s\n",key,data);
free(data);
free(key);

Deleting data

Once a mapping has an active member, that active member may be deleted. The same function call to delete the active member is used for every different mapping key type.

map_delete(mymap);

It is also possible to mass delete all members of a mapping while not destroying the mapping.

map_empty(mymap);

Example programs

Some example programs that use the MAP layer of the avlmap library, as well as the AVL layer, are included in the avlmap source code. These may be studied to see a complete C program that uses the avlmap library.


Advanced Features

Key and data types

The avlmap library supports a large number of key and data types:

The pointer data type is collated based on the internal machine address contained within the pointer. No referencing of the pointer value is performed by the avlmap library for the pointer type. It does provide a means to associate extra data with pointer based data structures used elsewhere in a program when referencing that extra data by other keys is not convenient.

The string and array types are really the same type. They differ only in the way they are called. The string type function argument lists do not include a length argument. Most C strings are terminated by a special terminator character which the avlmap string functions look for to determine the length of the string. If a string may contain binary data or otherwise have that terminator code as a valid data character, the length needs to be specified separately. The array functions serve that purpose by having a separate argument for the calling to provide the length. A mapping that uses a string as a key is interchangeable with a mapping that uses an array as a key as long as the same string or array element size, and signed or unsigned attribute, is used.

The scalar types are simpler, and retrieval of data values or keys is simply done by means of the function return value. The string and array types are more complicated due to the scalability of the data involved. Retrieval of string or array data values or keys is done by 4 different mechanisms:

Obtaining a shared pointer is dangerous because the calling program could modify the data in the mapping inappropriately. This is especially dangerous when obtaining a shared pointer to a key, since such modification will corrupt the mapping. The shared pointer is provided for instances when the overhead of allocating memory or copying the string or array is undesireable.

The functions to copy data or key to a specified destination include the ability to specify a maximum length for the destination location. The number of elements copied is one less than the specified number and then a terminating element is always added to the end, even for arrays. This can truncate the copied data. The original length of the string or array is returned by the copy functions, regardless of any truncation.

The duplicate functions make a duplicate string or array and return a pointer to it. They also include an optional argument which points to where to store the length. The calling program is responsible for freeing the memory that was allocated.

Ordered retrieval

Members of a mapping may be retreived sequentially in collated order. A forward function, as well as a reverse function, is provided which will step the active member to the next member in order. There are also first and last functions which cause the first or last member of a mapping to become the active member.

Looped retrieval

Two looping construct macros are provided which can be coded in place of a while or for clause in the C code. They will then iterate the mapping from the first to the last member, or from the last to the first member, with the body of the code being able to access the specific member. An example looks like:

map_loop_forward(mymap) {
data = map_str_fetch_dup(mymap,NULL);
key = map_str_key_dup(mymap,NULL);
printf("The capitol city of %s is %s\n",key,data);
free(data);
free(key);
}

Near keys

If an exact match for a key cannot be found, it is possible to request the nearest lower or nearest higher key to be matched in its place. This can be especially valuable for floating point keys, but may be used for any key type. The exact behaviour of the matching is specified as part of the find function name, or may be specified by a numeric value argument in a generic find function.

Duplicate keys

Duplicate members with the exact same key may be inserted through the use of a special version of the insert functions for each key type. The find functions can be oriented to select any of the duplicate keys, or to prefer the lower duplicate in sequence, or the higher duplicate in sequence. Duplicate keys can be useful when data has multiple instances of the same name, such as in some web form input elements name/value pairs.

Custom collation

A different collating sequence may be used for a mapping at the time it is created. This is done by creating a function which performs the comparison between two data elements of the key type involved, and returning an int value indicating the result of the comparison. For mappings with a scalar key, the comparison function will be given 2 arguments. For mappings with a string or array key, the comparison function will be given 4 arguments where the first 2 are the pointer and number of elements for one key, and the next 2 are the pointer and number of elements for the other key to be compared to the first.


The documentation, source code, and derived compilations are
Copyright © 2000 by Philip Howard.
This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA