Introduction
SuperString is an efficient string library for C++ that is optimized for both memory usage and CPU time.
It is built on the rope data structure, combined with other optimization techniques.
Features
- Fast and memory-optimized.
- Automatically garbage collected.
- Supports ASCII, UTF-8, UTF-16BE and UTF-32.
- Rich API.
- Easy to integrate and use.
- MIT licensed.
Table of contents
- Introduction
- Features
- Table of contents
- Contribute and support
- Install and use
- API
Contribute and support
If you have a feature idea, a bug to fix, or an improvement to suggest, feel free to open an issue or send a pull request.
Install and use
Using CMake
In your project, clone SuperString into the directory where third-party libraries live (let's call it ext).
mkdir ext && cd ext
git clone https://github.com/btwael/SuperString.git
Now add these lines to your CMakeLists.txt:
# include SuperString
add_subdirectory(ext/SuperString)
# add SuperString headers to include directory
include_directories(ext/SuperString/include)
# link your executable against SuperString
target_link_libraries(myexecutable SuperString)
Without CMake
The header file containing the SuperString declarations is SuperString/include/SuperString.hh, and the source file containing the definitions is SuperString/src/SuperString.cc. Use them however you prefer.
API
Construct a new string
As mentioned above, SuperString is automatically garbage collected, so you don't have to think about how or when to free a SuperString instance. To make this possible, there are two ways to create a SuperString: the static methods SuperString::Const and SuperString::Copy.
#include <iostream>
#include "SuperString.hh"
SuperString myFunc() {
char chars[] = "I'm using SuperString!";
SuperString string = SuperString::Copy(chars);
return string;
}
char seq[] = "SuperString is cool!";
int main(int argc, char const *argv[]) {
SuperString s1 = myFunc();
SuperString s2 = SuperString::Const(seq);
// equivalent to SuperString::Const("SuperString is cool!");
std::cout << s1 << "\n" << s2;
return 0;
}
In myFunc, we used SuperString::Copy because the sequence we're building the string from has a limited lifetime and will be destroyed once the function returns. ::Copy tells SuperString to copy the data and keep it for later use.
On the other hand, we used ::Const in main because the sequence lives as long as the executable does. This matters because, to avoid duplicating memory, SuperString does not copy the sequence.
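The difference between the two factories comes down to who owns the character data. The sketch below is not SuperString's implementation; it uses a hypothetical TextRef type, with invented borrow and copy functions, to contrast keeping only a pointer with copying the bytes into reference-counted storage (here std::shared_ptr, which is an assumption about how such storage could be managed).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical illustration type; not part of SuperString.
class TextRef {
public:
    // Borrow: keep only the pointer. The caller must keep `chars` alive.
    static TextRef borrow(const char *chars) {
        assert(chars != nullptr);
        TextRef t;
        t.data_ = chars;
        t.length_ = std::strlen(chars);
        return t;
    }

    // Copy: duplicate the bytes into shared, reference-counted storage
    // that is released when the last TextRef using it goes away.
    static TextRef copy(const char *chars) {
        assert(chars != nullptr);
        TextRef t;
        t.length_ = std::strlen(chars);
        t.owned_ = std::make_shared<std::string>(chars, t.length_);
        t.data_ = t.owned_->data();
        return t;
    }

    std::string str() const { return std::string(data_, length_); }
    bool ownsData() const { return owned_ != nullptr; }

private:
    const char *data_ = nullptr;
    std::size_t length_ = 0;
    std::shared_ptr<std::string> owned_; // empty when borrowing
};
```

With this sketch, changing the source buffer after construction is visible through a borrowed TextRef but not through a copied one, which is why borrowing is only appropriate for data that outlives the string.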
Static methods
SuperString::Const
This static method creates a new SuperString from a constant sequence of characters in a supported encoding. The encoding parameter defaults to SuperString::Encoding::UTF8. This method does not duplicate the string data in memory.
#include <iostream>
#include "SuperString.hh"
char asciiseq[] = "SuperString is cool!";
SuperString::Byte utf8seq[] = {0xe2, 0x82, 0xac, 0x00};
SuperString::Byte utf16beseq[] = {0x00, 0x24, 0x00, 0x00};
int utf32seq[] = {0x10437, 0x0000};
int main(int argc, char const *argv[]) {
SuperString s1 = SuperString::Const(asciiseq, SuperString::Encoding::ASCII);
SuperString s2 = SuperString::Const(utf8seq, SuperString::Encoding::UTF8);
SuperString s3 = SuperString::Const(utf16beseq, SuperString::Encoding::UTF16BE);
SuperString s4 = SuperString::Const(utf32seq, SuperString::Encoding::UTF32);
std::cout << s1 << "\n"; // SuperString is cool!
std::cout << s2 << "\n"; // €
std::cout << s3 << "\n"; // $
std::cout << s4 << "\n"; // 𐐷
return 0;
}
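To see why the bytes 0x00, 0x24 print as $, recall that UTF-16BE stores each 16-bit code unit with its most significant byte first. The following is a minimal sketch of that decoding, independent of SuperString: decodeUtf16BE is a hypothetical helper that takes an explicit byte count instead of a terminator and passes unpaired surrogates through unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of SuperString: decodes UTF-16BE bytes
// into Unicode code points. Unpaired surrogates are passed through as-is.
std::vector<std::uint32_t> decodeUtf16BE(const unsigned char *bytes,
                                         std::size_t byteCount) {
    assert(byteCount % 2 == 0); // UTF-16 code units are two bytes each
    std::vector<std::uint32_t> codePoints;
    for (std::size_t i = 0; i + 1 < byteCount; i += 2) {
        // Big-endian: the high byte comes first.
        std::uint32_t unit = (std::uint32_t(bytes[i]) << 8) | bytes[i + 1];
        bool isHigh = unit >= 0xD800 && unit <= 0xDBFF;
        if (isHigh && i + 3 < byteCount) {
            std::uint32_t next =
                (std::uint32_t(bytes[i + 2]) << 8) | bytes[i + 3];
            if (next >= 0xDC00 && next <= 0xDFFF) {
                // Combine a surrogate pair into one code point above U+FFFF.
                codePoints.push_back(0x10000 + ((unit - 0xD800) << 10) +
                                     (next - 0xDC00));
                i += 2; // skip the low surrogate as well
                continue;
            }
        }
        codePoints.push_back(unit);
    }
    return codePoints;
}
```

Under these assumptions, the two bytes 0x00, 0x24 decode to the single code point U+0024, and the surrogate pair 0xD8 0x01 0xDC 0x37 decodes to U+10437, the same character used in the UTF-32 example.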
SuperString::Copy
This static method creates a new SuperString from a constant sequence of characters in a supported encoding. The encoding parameter defaults to SuperString::Encoding::UTF8. This method copies the given sequence into newly allocated memory.
#include <iostream>
#include "SuperString.hh"
SuperString getString(int i) {
char asciiseq[] = "SuperString is cool!";
SuperString::Byte utf8seq[] = {0xe2, 0x82, 0xac, 0x00};
SuperString::Byte utf16beseq[] = {0x00, 0x24, 0x00, 0x00};
int utf32seq[] = {0x10437, 0x0000};
switch(i) {
case 0:
return SuperString::Copy(asciiseq, SuperString::Encoding::ASCII);
case 1:
return SuperString::Copy(utf8seq, SuperString::Encoding::UTF8);
case 2:
return SuperString::Copy(utf16beseq, SuperString::Encoding::UTF16BE);
case 3:
return SuperString::Copy(utf32seq, SuperString::Encoding::UTF32);
default:
return SuperString::Copy("Nothing"); // by default this is UTF8
}
}
int main(int argc, char const *argv[]) {
std::cout << getString(0) << "\n"; // SuperString is cool!
std::cout << getString(1) << "\n"; // €
std::cout << getString(2) << "\n"; // $
std::cout << getString(3) << "\n"; // 𐐷
return 0;
}
Methods
codeUnitAt(index)
Returns the code unit at the given index if index is less than the length of the string; otherwise, it returns SuperString::Error::RangeError.
#include <iostream>
#include <cstddef>
#include "SuperString.hh"
int main(int argc, char const *argv[]) {
SuperString::Result<int, SuperString::Error> res;
SuperString s = SuperString::Const("SuperString is fast!");
res = s.codeUnitAt(1);
if(res.isOk()) {
std::cout << "Valid index, code unit: " << res.ok() << "\n";
}
res = s.codeUnitAt(100);
if(res.isErr()) { // && res.err() == SuperString::Error::RangeError
std::cout << "Range error\n";
}
for(std::size_t i = 0; i < s.length(); i++) {
std::cout << s.codeUnitAt(i).ok() << "\n"; // safe here: i is always within range
}
return 0;
}
indexOf(pattern)
Returns the position of the first occurrence of pattern in this string. If pattern is not found, it returns SuperString::Error::NotFound.
#include <cstddef>
#include <iostream>
#include "SuperString.hh"
int main(int argc, char const *argv[]) {
SuperString::Result<std::size_t, SuperString::Error> res;
SuperString s = SuperString::Const("SuperString is fast and fast!");
res = s.indexOf(SuperString::Const("fast"));
if(res.isOk()) {
std::cout << res.ok() << "\n"; // 15
} else {
std::cout << "Not found" << "\n";
}
return 0;
}
isEmpty()
Returns true if this string is empty.
#include "SuperString.hh"
int main(int argc, char const *argv[]) {
SuperString s1;
SuperString s2 = SuperString::Const("");
SuperString s3 = SuperString::Const("SuperString");
s1.isEmpty(); // true
s2.isEmpty(); // true
s3.isEmpty(); // false
return 0;
}
isNotEmpty()
Returns true if this string is not empty.
#include "SuperString.hh"
int main(int argc, char const *argv[]) {
SuperString s1;
SuperString s2 = SuperString::Const("");
SuperString s3 = SuperString::Const("SuperString");
s1.isNotEmpty(); // false
s2.isNotEmpty(); // false
s3.isNotEmpty(); // true
return 0;
}
lastIndexOf(pattern)
Returns the position of the last occurrence of pattern in this string. If pattern is not found, it returns SuperString::Error::NotFound.
#include <cstddef>
#include <iostream>
#include "SuperString.hh"
int main(int argc, char const *argv[]) {
SuperString::Result<std::size_t, SuperString::Error> res;
SuperString s = SuperString::Const("SuperString is fast and fast!");
res = s.lastIndexOf(SuperString::Const("fast"));
if(res.isOk()) {
std::cout << res.ok() << "\n"; // 24
} else {
std::cout << "Not found" << "\n";
}
return 0;
}
length()
Returns the length of this string.
#include <iostream>
#include "SuperString.hh"
char asciiseq[] = "SuperString is cool!";
SuperString::Byte utf8seq[] = {0xe2, 0x82, 0xac, 0x00};
SuperString::Byte utf16beseq[] = {0x00, 0x24, 0x00, 0x00};
int utf32seq[] = {0x10437, 0x0000};
int main(int argc, char const *argv[]) {
SuperString s1 = SuperString::Const(asciiseq, SuperString::Encoding::ASCII);
SuperString s2 = SuperString::Const(utf8seq, SuperString::Encoding::UTF8);
SuperString s3 = SuperString::Const(utf16beseq, SuperString::Encoding::UTF16BE);
SuperString s4 = SuperString::Const(utf32seq, SuperString::Encoding::UTF32);
std::cout << s1.length() << "\n"; // 20
std::cout << s2.length() << "\n"; // 1
std::cout << s3.length() << "\n"; // 1
std::cout << s4.length() << "\n"; // 1
return 0;
}
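The expected output above shows length() returning 1 for the three-byte UTF-8 sequence, which suggests that it counts characters rather than bytes. As a rough illustration of how such a count can be derived for UTF-8, the sketch below counts every byte that is not a continuation byte. countUtf8CodePoints is a hypothetical helper, not a SuperString function, and it assumes well-formed, NUL-terminated input.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of SuperString: counts code points in a
// NUL-terminated UTF-8 sequence, assuming the input is well-formed.
std::size_t countUtf8CodePoints(const unsigned char *bytes) {
    assert(bytes != nullptr);
    std::size_t count = 0;
    for (; *bytes != 0; ++bytes) {
        // Continuation bytes look like 10xxxxxx; every other byte starts
        // a new code point.
        if ((*bytes & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the sequence 0xe2, 0x82, 0xac, 0x00 only the first byte starts a code point, so the count is 1 even though three bytes are stored.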
print(stream)
Prints this string to the given stream.
#include <iostream>
#include "SuperString.hh"
int main(int argc, char const *argv[]) {
SuperString s = SuperString::Const("SuperString is fast and fast!");
s.print(std::cout); // equivalent to: std::cout << s;
return 0;
}
print(stream, startIndex, endIndex)
Prints the substring of this string that starts at startIndex, inclusive, and ends at endIndex, exclusive, to the given stream.
#include <iostream>
#include "SuperString.hh"
int main(int argc, char const *argv[]) {
SuperString s = SuperString::Const("SuperString is fast and fast!");
s.print(std::cout, 0, 11); // Will print: SuperString
return 0;
}
substring(startIndex, endIndex)
Returns the substring of this string that extends from startIndex, inclusive, to endIndex, exclusive.
#include <iostream>
#include "SuperString.hh"
int main(int argc, char const *argv[]) {
SuperString::Result<SuperString, SuperString::Error> res;
SuperString s = SuperString::Const("SuperString is fast and fast!");
res = s.substring(0, 11);
if(res.isOk()) {
std::cout << res.ok() << "\n"; // Will print: SuperString
} else {
std::cout << "Range error" << "\n";
}
return 0;
}
trim()
Returns the string without any leading and trailing whitespace.
#include <iostream>
#include "SuperString.hh"
int main(int argc, char const *argv[]) {
SuperString s = SuperString::Const(" \tSuperString ");
std::cout << s.trim(); // Will print "SuperString" not " \tSuperString "
return 0;
}
trimLeft()
Returns the string without any leading whitespace.
#include <iostream>
#include "SuperString.hh"
int main(int argc, char const *argv[]) {
SuperString s = SuperString::Const(" \tSuperString ");
std::cout << s.trimLeft(); // Will print "SuperString " not " \tSuperString "
return 0;
}
trimRight()
Returns the string without any trailing whitespace.
#include <iostream>
#include "SuperString.hh"
int main(int argc, char const *argv[]) {
SuperString s = SuperString::Const(" \tSuperString ");
std::cout << s.trimRight(); // Will print " \tSuperString" not " \tSuperString "
return 0;
}
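For comparison, the same three operations can be sketched on std::string. This is only meant to pin down the semantics described above. The set of characters treated as whitespace (space, tab, newline, carriage return, form feed, vertical tab) is an assumption, since SuperString's own definition of whitespace is not stated here.

```cpp
#include <cassert>
#include <string>

// Assumed whitespace set for this sketch; SuperString's own set may differ.
static const char *const kWhitespace = " \t\n\r\f\v";

// Drops leading whitespace only.
std::string trimLeft(const std::string &s) {
    std::string::size_type first = s.find_first_not_of(kWhitespace);
    return first == std::string::npos ? std::string() : s.substr(first);
}

// Drops trailing whitespace only.
std::string trimRight(const std::string &s) {
    std::string::size_type last = s.find_last_not_of(kWhitespace);
    return last == std::string::npos ? std::string() : s.substr(0, last + 1);
}

// Drops whitespace on both ends.
std::string trim(const std::string &s) {
    std::string::size_type first = s.find_first_not_of(kWhitespace);
    if (first == std::string::npos) {
        return std::string(); // the string is empty or all whitespace
    }
    std::string::size_type last = s.find_last_not_of(kWhitespace);
    assert(last != std::string::npos && last >= first);
    return s.substr(first, last - first + 1);
}
```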
Operators
operator *
Creates a new string that concatenates this string with itself a given number of times.
#include <iostream>
#include "SuperString.hh"
int main(int argc, char const *argv[]) {
SuperString s = SuperString::Const("bla");
s = s * 3;
std::cout << s << "\n"; // blablabla
std::cout << s.substring(2, 7).ok() << "\n"; // ablab
return 0;
}
operator +
Creates a new string by concatenating this string with another string.
#include <iostream>
#include "SuperString.hh"
int main(int argc, char const *argv[]) {
SuperString s1 = SuperString::Const("bla");
SuperString s2 = SuperString::Const("kla");
SuperString s = s1 + s2 + s1;
std::cout << s << "\n"; // blaklabla
std::cout << s.substring(2, 9).ok() << "\n"; // aklabla
return 0;
}
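The introduction mentions that SuperString is built on a rope. The sketch below shows, under simplifying assumptions, why a rope makes concatenation and repetition cheap: joining two strings creates one small node that points at both operands instead of copying their characters. All names here (RopeNode, leaf, concat, repeat, charAt, flatten) are hypothetical and do not reflect SuperString's internal representation; the tree is also left unbalanced for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical rope node, not SuperString's internal representation.
// A node is either a leaf holding text or an inner node joining two children.
struct RopeNode {
    std::string text;                      // used by leaves only
    std::shared_ptr<const RopeNode> left;  // used by inner nodes only
    std::shared_ptr<const RopeNode> right; // used by inner nodes only
    std::size_t length = 0;                // total characters below this node
};

using Rope = std::shared_ptr<const RopeNode>;

Rope leaf(const std::string &text) {
    auto node = std::make_shared<RopeNode>();
    node->text = text;
    node->length = text.size();
    return node;
}

// Concatenation allocates one small node and copies no characters.
Rope concat(const Rope &a, const Rope &b) {
    auto node = std::make_shared<RopeNode>();
    node->left = a;
    node->right = b;
    node->length = a->length + b->length;
    return node;
}

// Repetition reuses the same child several times instead of duplicating it.
Rope repeat(const Rope &a, std::size_t times) {
    Rope result = leaf("");
    for (std::size_t i = 0; i < times; ++i) {
        result = concat(result, a);
    }
    return result;
}

// Indexing walks down the tree, choosing a side by the left child's length.
char charAt(const Rope &rope, std::size_t index) {
    assert(index < rope->length);
    const RopeNode *node = rope.get();
    while (node->left) {
        if (index < node->left->length) {
            node = node->left.get();
        } else {
            index -= node->left->length;
            node = node->right.get();
        }
    }
    return node->text[index];
}

// Flattening copies the characters once, when contiguous text is needed.
std::string flatten(const Rope &rope) {
    if (!rope->left) {
        return rope->text;
    }
    return flatten(rope->left) + flatten(rope->right);
}
```

In this sketch, concat(concat(bla, kla), bla) shares the bla leaf between two positions, so the text "bla" is stored once even though it appears twice in the result.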
Nested types
SuperString::Encoding
This type is defined as follows:
class SuperString {
...
enum class Encoding {
ASCII,
UTF8,
UTF16BE,
UTF32
};
...
};
SuperString::Error
This type is defined as follows:
class SuperString {
...
enum class Error {
Unimplemented,
Unexpected, // Something that never happens, Unreachable code
RangeError,
InvalidByteSequence,
NotFound
};
...
};
SuperString::Byte
This type is defined as follows:
class SuperString {
...
typedef unsigned char Byte;
...
};
SuperString::Result<T, E>
This type is inspired by Rust's std::result::Result<T, E> and is defined as:
class SuperString {
...
template<class T, class E>
class Result {
private:
char *_ok;
char *_err;
public:
Result(T ok);
Result(E err);
Result(const SuperString::Result<T, E> &other) /*copy*/;
~Result();
/**
* Returns the error value.
*/
E err() const;
/**
* Returns true if the result is Err.
*/
bool isErr() const;
/**
* Returns true if the result is Ok.
*/
bool isOk() const;
/**
* Returns the success value.
*/
T ok() const;
/**
* Sets this to Err with given [err] value.
*/
void err(E err);
/**
* Sets this to Ok with given [ok] value.
*/
void ok(T ok);
SuperString::Result<T, E> &operator=(const SuperString::Result<T, E> &other);
};
...
};
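To illustrate the calling pattern this type encourages, here is a simplified, self-contained stand-in. SimpleResult is not the class declared above: it stores both values inline rather than in char buffers, it requires T and E to be default-constructible, and it omits the setters. checkedSubstring and the local Error enum are likewise hypothetical names used only for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Local error enum for the sketch; mirrors two of the values listed above.
enum class Error { RangeError, NotFound };

// Simplified stand-in for SuperString::Result<T, E>. Both T and E must be
// default-constructible here, which the real type may not require.
template<class T, class E>
class SimpleResult {
public:
    SimpleResult(T ok) : ok_(ok), err_(), isOk_(true) {}
    SimpleResult(E err) : ok_(), err_(err), isOk_(false) {}

    bool isOk() const { return isOk_; }
    bool isErr() const { return !isOk_; }

    // Callers are expected to check isOk()/isErr() before reading a value.
    T ok() const { assert(isOk_); return ok_; }
    E err() const { assert(!isOk_); return err_; }

private:
    T ok_;
    E err_;
    bool isOk_;
};

// Hypothetical function: returns the [start, end) substring, or a
// RangeError when the indices do not describe a valid range.
SimpleResult<std::string, Error> checkedSubstring(const std::string &s,
                                                  std::size_t start,
                                                  std::size_t end) {
    if (start > end || end > s.size()) {
        return Error::RangeError;
    }
    return s.substr(start, end - start);
}
```

The point of the pattern is that a failure is an ordinary return value: the caller branches on isOk() or isErr() instead of catching an exception or checking a sentinel index.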