String handling package for embedded systems

This is a package of string manipulation routines. All of these routines share one important property: they're bounded. I've seen so many bugs caused by unbounded routines (like strcpy and sprintf) that I generally refuse to use them.

Here are the files:

Str.h           Header file.
StrMaxCpy.cpp   Contains the StrMaxCpy function.
StrMaxCat.cpp   Contains the StrMaxCat function.
StrPrintf.cpp   Contains the StrPrintf, vStrPrintf, StrXPrintf, and vStrXPrintf functions.
StrToken.h      Header for the String Tokenizer class.
StrToken.cpp    Implementation of the String Tokenizer class.

StrMaxCpy, StrMaxCat, and the StrPrintf functions are really C files, but I like using them as C++ files so that I get type-safe linking. You can rename these to .c files, if desired.

char *StrMaxCpy( char *dst, const char *src, size_t maxLen )

Copies src to dst, ensuring that the length of dst doesn't exceed maxLen - 1. dst is guaranteed to always be terminated by the end-of-string indicator ('\0').
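To make the bound concrete, here is a minimal sketch of a copy routine that behaves as described above. It is not the package's source: the name SketchMaxCpy and the handling of maxLen == 0 are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for StrMaxCpy, written from the description above.
// Assumption: when maxLen is 0 there is no room for the terminator, so dst
// is left untouched.
char *SketchMaxCpy( char *dst, const char *src, size_t maxLen )
{
    if ( maxLen == 0 )
    {
        return dst;
    }

    size_t i = 0;
    while (( i < maxLen - 1 ) && ( src[ i ] != '\0' ))
    {
        dst[ i ] = src[ i ];
        i++;
    }
    dst[ i ] = '\0';    // always terminated, even when src was truncated
    return dst;
}

// Usage: copying into a buffer that is too small truncates instead of
// overflowing.
void SketchMaxCpyDemo()
{
    char buf[ 8 ];

    SketchMaxCpy( buf, "embedded systems", sizeof( buf ));
    assert( std::strcmp( buf, "embedde" ) == 0 );   // 7 characters plus '\0'
}
```

Passing sizeof( buf ) as maxLen is the usual calling pattern: the routine then leaves room for the terminator on its own.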

char *StrMaxCat( char *dst, const char *src, size_t maxLen )

Concatenates src to dst, ensuring that the length of dst doesn't exceed maxLen - 1. dst is guaranteed to always be terminated by the end-of-string indicator ('\0'). Returns a pointer to dst except if maxLen is < 0, in which case a pointer to an empty string is returned.
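A bounded concatenation along the same lines might look like the sketch below. Again, this is an illustration rather than the package's code: SketchMaxCat is a hypothetical name, and the sketch simply returns dst unchanged when there is no room left (including when maxLen is 0), which may differ from what StrMaxCat does in those edge cases.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for StrMaxCat, written from the description above.
// maxLen is the total size of dst, not the number of characters to append.
char *SketchMaxCat( char *dst, const char *src, size_t maxLen )
{
    // Find the end of dst, without ever looking past maxLen.
    size_t len = 0;
    while (( len < maxLen ) && ( dst[ len ] != '\0' ))
    {
        len++;
    }
    if ( len >= maxLen )
    {
        return dst;     // assumption: no room (or no terminator), do nothing
    }

    size_t i = 0;
    while (( len + i < maxLen - 1 ) && ( src[ i ] != '\0' ))
    {
        dst[ len + i ] = src[ i ];
        i++;
    }
    dst[ len + i ] = '\0';
    return dst;
}

// Usage: the result never exceeds sizeof( buf ) - 1 characters.
void SketchMaxCatDemo()
{
    char buf[ 10 ] = "Str";

    SketchMaxCat( buf, "Tokenizer", sizeof( buf ));
    assert( std::strcmp( buf, "StrTokeni" ) == 0 );   // 9 characters plus '\0'
}
```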

int StrPrintf( char *outStr, int maxLen, const char *fmt, ... )
int StrXPrintf( StrXPrintfFunc outFunc, void *outParm, const char *fmt, ... )
int vStrPrintf( char *outStr, int maxLen, const char *fmt, va_list args )
int vStrXPrintf( StrXPrintfFunc outFunc, void *outParm, const char *fmt, va_list args )

Generic, reentrant printf function.

The format string fmt consists of ordinary characters, escape sequences, and format specifications. The ordinary characters and escape sequences are output in their order of appearance. Format specifications start with a percent sign (%) and are read from left to right. When the first format specification (if any) is encountered, it converts the value of the first argument after fmt and outputs it accordingly. The second format specification causes the second argument to be converted and output, and so on. If there are more arguments than there are format specifications, the extra arguments are ignored. The results are undefined if there are not enough arguments for all the format specifications.

A format specification has optional and required fields, in the following form:

    %[flags][width][.precision][l]type

Each field of the format specification is a single character or a number specifying a particular format option. The simplest format specification contains only the percent sign and a type character (for example %s). If a percent sign is followed by a character that has no meaning as a format field, the character is sent to the output function. For example, to print a percent-sign character, use %%.

The optional fields, which appear before the type character, control other aspects of the formatting, as follows:

flags may be one of the following:

- (minus sign) Left-align the result within the given field width.
0 (zero) Pad with zeros until the minimum width is reached.

width may be one of the following:

a number specifying the minimum width of the field
* (asterisk) means that an integer taken from the argument list will be used to provide the width. The width argument must precede the value being formatted in the argument list.

precision may be one of the following:

a number
* (asterisk) means that an integer taken from the argument list will be used to provide the precision. The precision argument must precede the value being formatted in the argument list.

The interpretation of precision depends on the type of field being formatted:

for b, d, o, u, x, X, the precision specifies the minimum number of digits that will be printed. If the number of digits in the argument is less than precision, the output value is padded on the left with zeros. The value is not truncated when the number of digits exceeds precision.
for s, the precision specifies the maximum number of characters to be printed.

The optional type modifier l (lowercase ell) may be used to specify that the argument is a long. This makes a difference on architectures where sizeof(int) differs from sizeof(long).

type causes the output to be formatted as follows:

b Unsigned binary integer.
c Character.
d Signed decimal integer.
o Unsigned octal integer.
s Null terminated character string.
u Unsigned decimal integer.
x Unsigned hexadecimal integer, using "abcdef".
X Unsigned hexadecimal integer, using "ABCDEF".
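The flags, width, precision, and l fields described above follow the same conventions as the standard printf family for the d, s, x, and X types, so the sketch below uses std::snprintf as a stand-in to show their effect. It does not call StrPrintf, and the b type is left out because it is not part of the classic standard printf.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Illustrates the format fields using std::snprintf as a stand-in.
// FormatFieldDemo is a hypothetical name used only for this example.
void FormatFieldDemo()
{
    char buf[ 32 ];

    std::snprintf( buf, sizeof( buf ), "[%-5d]", 42 );          // - flag
    assert( std::strcmp( buf, "[42   ]" ) == 0 );

    std::snprintf( buf, sizeof( buf ), "[%05d]", 42 );          // 0 flag
    assert( std::strcmp( buf, "[00042]" ) == 0 );

    std::snprintf( buf, sizeof( buf ), "[%*d]", 5, 42 );        // width from args
    assert( std::strcmp( buf, "[   42]" ) == 0 );

    std::snprintf( buf, sizeof( buf ), "[%.4d]", 42 );          // minimum digits
    assert( std::strcmp( buf, "[0042]" ) == 0 );

    std::snprintf( buf, sizeof( buf ), "[%.2d]", 12345 );       // never truncates
    assert( std::strcmp( buf, "[12345]" ) == 0 );

    std::snprintf( buf, sizeof( buf ), "[%.3s]", "string" );    // maximum chars
    assert( std::strcmp( buf, "[str]" ) == 0 );

    std::snprintf( buf, sizeof( buf ), "[%.*s]", 2, "string" ); // precision from args
    assert( std::strcmp( buf, "[st]" ) == 0 );

    std::snprintf( buf, sizeof( buf ), "[%lx %X]", 255UL, 255u ); // l modifier, X type
    assert( std::strcmp( buf, "[ff FF]" ) == 0 );
}
```

Note that in the width and precision cases using *, the extra int argument comes before the value it applies to.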

outFunc will be called to output each character. If outFunc returns a number >= 0, then vStrXPrintf or StrXPrintf will continue to call outFunc with additional characters.

If outFunc returns a negative number, then vStrXPrintf will stop calling outFunc and will return the non-negative return value.
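The callback protocol can be sketched as follows. The signature of StrXPrintfFunc is not shown above, so the SketchOutFunc typedef here (an opaque parameter plus the character to output) is an assumption, as are the SketchBuf structure and the SketchEmit driver, which only imitates the "call until negative" loop and returns the number of characters accepted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumed callback shape: outParm is passed through untouched, ch is the
// character to output. A negative return asks the caller to stop.
typedef int (*SketchOutFunc)( void *outParm, int ch );

// Hypothetical output target: a bounded, always-terminated buffer.
struct SketchBuf
{
    char   *str;
    size_t  maxLen;     // total size of str, including the terminator
    size_t  len;        // characters stored so far
};

int SketchBufOut( void *outParm, int ch )
{
    SketchBuf *buf = static_cast< SketchBuf * >( outParm );

    if ( buf->len + 1 >= buf->maxLen )
    {
        return -1;      // full: tell the caller to stop
    }
    buf->str[ buf->len++ ] = static_cast< char >( ch );
    buf->str[ buf->len ] = '\0';
    return 1;
}

// Stand-in for the formatter's output loop: feeds characters to outFunc
// until it returns a negative number.
int SketchEmit( SketchOutFunc outFunc, void *outParm, const char *text )
{
    int numOutput = 0;

    for ( const char *s = text; *s != '\0'; s++ )
    {
        if ( outFunc( outParm, *s ) < 0 )
        {
            break;
        }
        numOutput++;
    }
    return numOutput;
}

void SketchOutFuncDemo()
{
    char      storage[ 6 ] = "";
    SketchBuf buf = { storage, sizeof( storage ), 0 };

    int n = SketchEmit( SketchBufOut, &buf, "temperature" );
    assert( n == 5 );
    assert( std::strcmp( storage, "tempe" ) == 0 );
}
```

Because all state travels through outParm, the same callback can serve several buffers at once, which is what keeps this style of output reentrant.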

StrTokenizer is basically a C++ class that provides a reentrant version of strtok.

StrTokenizer()

Default constructor. You'll need to call Init to use the tokenizer.

StrTokenizer( const char *str, char *outToken, size_t maxLen )

Specifies the string to be tokenized, as well as the place to store the tokens. maxLen specifies the maximum length of outToken.

void Init( const char *str, char *outToken, size_t maxLen )

Allows objects constructed using the default constructor to be initialized.

char *NextToken( const char *delim )

Returns a pointer to the next token. Initial characters contained in delim are skipped. The first character not contained in delim will be the beginning of the token. The next occurrence of a character from delim will terminate the token.

const char *Remainder() const

Returns the remaining untokenized portion of the string.
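The interface above can be illustrated with a small self-contained class. This is a sketch written from the descriptions, not the StrTokenizer source: the name SketchTokenizer is hypothetical, and several details are assumptions, namely that NextToken returns NULL when no tokens remain, that an overlong token is truncated to maxLen - 1 characters, and that the terminating delimiter is left in the remainder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

class SketchTokenizer
{
public:
    SketchTokenizer( const char *str, char *outToken, size_t maxLen )
        : m_str( str ), m_outToken( outToken ), m_maxLen( maxLen )
    {
    }

    char *NextToken( const char *delim )
    {
        if ( m_maxLen == 0 )
        {
            return NULL;    // no room to store any token
        }

        // Skip leading delimiters.
        while (( *m_str != '\0' ) && ( std::strchr( delim, *m_str ) != NULL ))
        {
            m_str++;
        }
        if ( *m_str == '\0' )
        {
            return NULL;    // assumption: NULL signals "no more tokens"
        }

        // Copy up to the next delimiter, storing only what fits.
        size_t len = 0;
        while (( *m_str != '\0' ) && ( std::strchr( delim, *m_str ) == NULL ))
        {
            if ( len < m_maxLen - 1 )
            {
                m_outToken[ len++ ] = *m_str;
            }
            m_str++;
        }
        m_outToken[ len ] = '\0';
        return m_outToken;
    }

    const char *Remainder() const
    {
        return m_str;
    }

private:
    const char *m_str;       // untokenized portion of the input
    char       *m_outToken;  // caller-supplied token storage
    size_t      m_maxLen;    // size of m_outToken
};

void SketchTokenizerDemo()
{
    char token[ 8 ];
    SketchTokenizer tok( "set  speed=42", token, sizeof( token ));

    assert( std::strcmp( tok.NextToken( " =" ), "set" ) == 0 );
    assert( std::strcmp( tok.Remainder(), "  speed=42" ) == 0 );
    assert( std::strcmp( tok.NextToken( " =" ), "speed" ) == 0 );
    assert( std::strcmp( tok.NextToken( " =" ), "42" ) == 0 );
    assert( tok.NextToken( " =" ) == NULL );
}
```

Unlike strtok, the input string is never modified and all state lives in the object, so several tokenizers can run at the same time.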
