Tokenizer Improvements

  1. Changed the default TokenizerFunction for tokenizer to char_separator.
  2. Added more tests to the examples, including tests that use only InputIterators.
  3. Changed offset_separator, char_delimiters_separator, and char_separator to use tok.assign when the iterator for the sequence they are dealing with is at least a forward iterator. If it is not, they use tok+= as they do in the current release.
  4. A TokenizerFunction's operator() and reset() are now const functions. If a TokenizerFunction needs to maintain information beyond the position of the iterators, it typedefs mutable_type to the type of the variable it needs, and a reference to a mutable_type object is passed in to operator() and reset(). Making operator() and reset() const allows one TokenizerFunction to be shared across all iterator instances of a single tokenizer.
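
As a rough illustration of item 3, the choice between tok.assign and tok+= can be made at compile time by dispatching on the iterator category. This is only a sketch, not the library's actual code: the names make_token and assign_or_append are made up for this example, and std::string stands in for the token type.

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper (not a library function): forward iterators or better
// can be traversed as a range, so the whole token is handed to assign().
template <class Iterator, class Token>
void assign_or_append(Iterator first, Iterator last, Token& tok,
                      std::forward_iterator_tag) {
    tok.assign(first, last);
}

// Hypothetical helper: plain input iterators are single pass, so the token
// is built up one element at a time with +=.
template <class Iterator, class Token>
void assign_or_append(Iterator first, Iterator last, Token& tok,
                      std::input_iterator_tag) {
    tok = Token();  // start from an empty token
    for (; first != last; ++first) {
        tok += *first;
    }
}

// Hypothetical entry point: picks one of the overloads above based on the
// iterator category reported by std::iterator_traits.
template <class Iterator, class Token>
void make_token(Iterator first, Iterator last, Token& tok) {
    typedef typename std::iterator_traits<Iterator>::iterator_category category;
    assign_or_append(first, last, tok, category());
}

// Small self-check that exercises both paths.
inline void self_check() {
    std::string text = "hello world";
    std::string tok;
    make_token(text.begin(), text.begin() + 5, tok);  // assign() path
    assert(tok == "hello");

    std::istringstream in("abc");
    std::istreambuf_iterator<char> first(in), last;
    make_token(first, last, tok);                     // += path
    assert(tok == "abc");
}
```

Because the iterator tags form an inheritance chain, bidirectional and random access iterators also select the assign() overload, and only true input iterators fall back to +=.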

Acknowledgements

I would like to thank Gennadiy E. Rozental for suggesting using tok.assign instead of tok+=. I would also like to thank George A. Heintzelman for his idea of distinguishing the const and non-const aspects of TokenizerFunction.

Downloading and Usage

Download from here. To use, unzip it and put it in your include path before the regular boost library.

Comments

I would love to hear whatever comments you have. I can be reached at jbandela@ufl.edu, and I also read the boost list. Please post comments to the boost list with "Tokenizer Improvements" somewhere in the subject line so I can easily find them.