Introduction
A set of basic dynamic string macros for C programs is included with uthash in utstring.h. To use them in your own C program, copy utstring.h into your source directory and include it:
#include "utstring.h"
The dynamic string supports operations such as inserting data, concatenation, getting the length and content, substring search, and clear. It’s ok to put binary data into a utstring too. The string operations are listed below.
Some utstring operations are implemented as functions rather than macros.
Download
To download the utstring.h header file, follow the links on https://github.com/troydhanson/uthash to clone uthash or get a zip file, then look in the src/ sub-directory.
BSD licensed
This software is made available under the revised BSD license. It is free and open source.
Platforms
The utstring macros have been tested on:
- Linux,
- Windows, using Visual Studio 2008 and Visual Studio 2010
Usage
Declaration
The dynamic string itself has the data type UT_string. It is declared like this:
UT_string *str;
New and free
The next step is to create the string using utstring_new. Later, when you’re done with it, utstring_free will free it and all its content.
Manipulation
The utstring_printf and utstring_bincpy operations insert (copy) data into the string. To concatenate one utstring to another, use utstring_concat. To clear the content of the string, use utstring_clear. The length of the string is available from utstring_len, and its content from utstring_body, which evaluates to a char*. The buffer it points to is always null-terminated, so it can be passed directly to external functions that expect a C string. This automatic null terminator is not counted in the length of the string.
Samples
These examples show how to use utstring.
#include <stdio.h>
#include "utstring.h"
int main() {
UT_string *s;
utstring_new(s);
utstring_printf(s, "hello world!" );
printf("%s\n", utstring_body(s));
utstring_free(s);
return 0;
}
The next example demonstrates that utstring_printf appends to the string. It also shows concatenation.
#include <stdio.h>
#include "utstring.h"
int main() {
UT_string *s, *t;
utstring_new(s);
utstring_new(t);
utstring_printf(s, "hello " );
utstring_printf(s, "world " );
utstring_printf(t, "hi " );
utstring_printf(t, "there " );
utstring_concat(s, t);
printf("length: %u\n", (unsigned)utstring_len(s));
printf("%s\n", utstring_body(s));
utstring_free(s);
utstring_free(t);
return 0;
}
The next example shows how binary data can be inserted into the string. It also clears the string and prints new data into it.
#include <stdio.h>
#include "utstring.h"
int main() {
UT_string *s;
char binary[] = "\xff\xff";
utstring_new(s);
utstring_bincpy(s, binary, sizeof(binary));
printf("length is %u\n", (unsigned)utstring_len(s));
utstring_clear(s);
utstring_printf(s,"number %d", 10);
printf("%s\n", utstring_body(s));
utstring_free(s);
return 0;
}
Reference
These are the utstring operations.
Operations
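| utstring_new(s) | allocate a new utstring |
| utstring_renew(s) | allocate a new utstring (if s is NULL), otherwise clear it |
| utstring_free(s) | free an allocated utstring |
| utstring_init(s) | init a utstring (non-alloc) |
| utstring_done(s) | dispose of a utstring (non-alloc) |
| utstring_printf(s,fmt,...) | printf into a utstring (appends) |
| utstring_bincpy(s,bin,len) | insert binary data of length len (appends) |
| utstring_concat(dst,src) | concatenate src utstring to end of dst utstring |
| utstring_clear(s) | clear the content of s (setting its length to 0) |
| utstring_len(s) | obtain the length of s as an unsigned integer |
| utstring_body(s) | get char* to the body of s (buffer is always null-terminated) |
| utstring_find(s,pos,str,len) | forward search from pos for a substring |
| utstring_findR(s,pos,str,len) | reverse search from pos for a substring |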
New/free vs. init/done
Use utstring_new and utstring_free to allocate a new string or free it. If the UT_string itself is statically allocated (declared as a variable rather than created with utstring_new), use utstring_init and utstring_done to initialize or free its internal memory.
Substring search
Use utstring_find and utstring_findR to search for a substring in a utstring. The search comes in forward and reverse varieties; the reverse search scans from the end of the string backward. Both take a position to start searching from, measured from 0 (the start of the utstring). A negative position is counted from the end of the string, so -1 is the last position. Note that in the reverse search, the initial position anchors to the end of the substring being searched for (e.g., the t in cat). The return value always refers to the offset where the substring starts in the utstring. When no match is found, -1 is returned.
For example, if a utstring called s contains:
ABC ABCDAB ABCDABCDABDE
Then these forward and reverse substring searches for ABC produce these results:
utstring_find( s, -9, "ABC", 3 ) = 15
utstring_find( s, 3, "ABC", 3 ) = 4
utstring_find( s, 16, "ABC", 3 ) = -1
utstring_findR( s, -9, "ABC", 3 ) = 11
utstring_findR( s, 12, "ABC", 3 ) = 4
utstring_findR( s, 2, "ABC", 3 ) = 0
"Multiple use" substring search
The preceding examples show "single use" versions of substring matching, where the Knuth-Morris-Pratt (KMP) table is built internally and freed again after each search. If your program needs to run many searches for the same substring, it is more efficient to build the KMP table once and reuse it.
To reuse the KMP table, build it manually and then pass it into the internal search functions. The functions involved are:
_utstring_BuildTable (build the KMP table for a forward search)
_utstring_BuildTableR (build the KMP table for a reverse search)
_utstring_find (forward search using a prebuilt KMP table)
_utstring_findR (reverse search using a prebuilt KMP table)
This is an example of building a forward KMP table for the substring "ABC", and then using it in a search:
long *KMP_TABLE, offset;
/* the table needs one entry per substring character, plus one */
KMP_TABLE = (long *)malloc(sizeof(long) * (strlen("ABC") + 1));
_utstring_BuildTable("ABC", 3, KMP_TABLE);
offset = _utstring_find(utstring_body(s), utstring_len(s), "ABC", 3, KMP_TABLE);
free(KMP_TABLE);
Note that the internal _utstring_find has the length of the UT_string as its second argument, rather than the start position. You can emulate the position parameter by adding to the string start address and subtracting from its length.