TastyLib is a c++ library of data structures and algorithms.
It is also a header-only library, which means that you could just copy the include/tastylib
directory to your project's include path and use it without caring about the issues related to link libraries.
Linux | Windows | Coverage |
---|---|---|
Contents below show the data structures and algorithms available in this project. Just click the links at the name
column to see the details of their usages and benchmarks. All benchmarks are run with the -O2
compiler flag under the Release
version.
Name | Source | Benchmarked | Note | Reference |
---|---|---|---|---|
DoublyLinkedList | Unit test DoublyLinkedList.h |
Yes | A linked data structure that consists of a set of sequentially linked records. It also supports merge sort. | Wikipedia |
BinaryHeap | Unit test BinaryHeap.h |
Yes | A heap data structure taking the form of a complete binary tree. A common way of implementing priority queue . | Wikipedia |
HashTable | Unit test HashTable.h |
No | A data structure that stores unique elements in no particular order, and which allows for fast retrieval of individual elements based on their values. Similar to std::unordered_set . | Wikipedia |
AVLTree | Unit test AVLTree.h |
Yes | A self-balancing binary search tree. | Wikipedia |
Graph | Unit test Graph.h |
No | A data structure to implement the directed/undirected graph concepts from mathematics. It stores a graph in an adjacency list or matrix. | Wikipedia |
Name | Source | Benchmarked | Note | Reference |
---|---|---|---|---|
MD5 | Unit test MD5.h |
Yes | A widely used hash function producing a 128-bit hash value. | Wikipedia |
NPuzzle | Unit test NPuzzle.h |
Yes | A classic searching problem solved with A* search . A GUI demo has been provided. | Wikipedia |
Sort | Unit test Sort.h |
Yes | Including insertion sort , selection sort , heapsort , quicksort , quickselect . For merge sort , please refer toDoublyLinkedList.sort(). | Wikipedia |
Dijkstra | Unit test Dijkstra.h |
No | An algorithm to find the shortest paths between vertices in a graph. | Wikipedia |
Install CMake .
Generate build files using the commands below:
Build benchmarks only
$ mkdir build $ cd build $ cmake ..
Build benchmarks and unit tests
$ mkdir build $ cd build $ git submodule init $ git submodule update $ cmake -DTASTYLIB_BUILD_TEST=ON ..
Build files will be generated in the build
directory based on your operating system. Use them to build this project:
Linux | OS X | Windows |
---|---|---|
Makefile | Makefile | Visual Studio Project |
All executables(benchmarks and unit tests) will be generated in the bin
directory. To run all unit tests together, use command below:
$ ctest --verbose
#include "tastylib/DoublyLinkedList.h" using namespace tastylib; int main() { DoublyLinkedList<int> list; auto isEmpty = list.isEmpty(); // isEmpty == true list.insertBack(1); // List content: 1 list.insertFront(2); // List content: 2 1 list.insert(1, 3); // List content: 2 3 1 list.insert(3, 4); // List content: 2 3 1 4 list.sort(); // List content: 1 2 3 4 auto p1 = list.find(3); // p1 == 2 list.remove(p1); // List content: 1 2 4 list.removeFront(); // List content: 2 4 list.removeBack(); // List content: 2 auto p2 = list.find(3); // p2 == -1 auto size = list.getSize(); // size == 1 return 0; }
Operation | Time |
---|---|
insertFront() | O(1) |
removeFront() | O(1) |
insertBack() | O(1) |
removeBack() | O(1) |
insert() | O(n) |
remove() | O(n) |
find() | O(n) |
sort() (merge sort) | O(nlogn) |
Source: benchmark_DoublyLinkedList.cpp
The program compares the time cost of DoublyLinkedList
with std::list
. When benchmarking find()
and sort()
, the size of the list is 100,000 and 5,000,000 , respectively. Here are the results under different environments:
Operation | std::list | DoublyLinkedList |
---|---|---|
insertFront() | 29 ns | 36 ns |
removeFront() | 15 ns | 12 ns |
insertBack() | 31 ns | 28 ns |
removeBack() | 14 ns | 12 ns |
find() | 165 µs | 230 µs |
sort() | 3432 ms | 3011 ms |
Operation | std::list | DoublyLinkedList |
---|---|---|
insertFront() | 57 ns | 46 ns |
removeFront() | 42 ns | 49 ns |
insertBack() | 55 ns | 48 ns |
removeBack() | 48 ns | 51 ns |
find() | 112 µs | 114 µs |
sort() | 3534 ms | 3138 ms |
Operation | std::list | DoublyLinkedList |
---|---|---|
insertFront() | 65 ns | 66 ns |
removeFront() | 70 ns | 82 ns |
insertBack() | 69 ns | 68 ns |
removeBack() | 73 ns | 79 ns |
find() | 10 ns* | 9 ns* |
sort() | 3717 ms | 3729 ms |
#include "tastylib/BinaryHeap.h" using namespace tastylib; int main() { BinaryHeap<int> heap; // Create a min-root heap auto isEmpty = heap.isEmpty(); // isEmpty == true heap.push(50); heap.push(20); heap.push(30); auto size1 = heap.getSize(); // size1 == 3 auto val1 = heap.top(); // val1 == 20 heap.pop(); auto val2 = heap.top(); // val2 == 30 heap.pop(); auto val3 = heap.top(); // val3 == 50 heap.pop(); auto size2 = heap.getSize(); // size2 == 0 return 0; }
Operation | Time |
---|---|
push() | O(nlogn) |
top() | O(1) |
pop() | O(nlogn) |
Source: benchmark_BinaryHeap.cpp
The program compares the time cost of BinaryHeap
with std::priority_queue
. It calculates the average time cost of each operation. Here are the results under different environments:
Operation | std::priority_queue | BinaryHeap |
---|---|---|
push() | 17 ns | 16 ns |
pop() | 294 ns | 293 ns |
Operation | std::priority_queue | BinaryHeap |
---|---|---|
push() | 23 ns | 22 ns |
pop() | 498 ns | 254 ns |
#include "tastylib/HashTable.h" #include <string> using namespace tastylib; int main() { HashTable<std::string> table; auto isEmpty = table.isEmpty(); // isEmpty == true table.insert("Alice"); table.insert("Darth"); auto size1 = table.getSize(); // size1 == 2 auto hasAlice = table.has("Alice"); // hasAlice == true auto hasDarth = table.has("Darth"); // hasDarth == true table.remove("Darth"); hasAlice = table.has("Alice"); // hasAlice == true hasDarth = table.has("Darth"); // hasDarth == false auto size2 = table.getSize(); // size2 == 1 return 0; }
Operation | Time |
---|---|
insert() | O(1) |
has()/find() | O(1) |
remove() | O(1) |
rehash() | O(n) |
Note that there are many different ways to implement the hash table. The C++ standard library implements the std::unordered_set
as a dynamic hash table, which means that its bucket amount changes dynamically when performing insert()
and remove()/erase()
operations(i.e., using extendible hashing or linear hashing ). While in TastyLib, for simplicity, the hash table is static so its bucket amount is fixed after initialized. Since different implementations have different pros and cons, it's hard to give a convincing benchmark result.
#include "tastylib/AVLTree.h" #include <string> using namespace tastylib; int main() { AVLTree<int> tree; auto isEmpty = tree.isEmpty(); // isEmpty == true tree.insert(1); tree.insert(2); tree.insert(3); tree.insert(3); std::string str1 = tree.preorder(); // str1 == "{2, 1, 3, 3}" std::string str2 = tree.inorder(); // str2 == "{1, 2, 3, 3}" std::string str3 = tree.postorder(); // str3 == "{1, 3, 3, 2}" auto size1 = tree.getSize(); // size1 == 4 auto found1 = tree.has(3); // found1 == true tree.remove(3); std::string str4 = tree.preorder(); // str4 == "{2, 1}" auto size2 = tree.getSize(); // size2 == 2 auto found2 = tree.has(3); // found2 == false return 0; }
Operation | Time |
---|---|
find() | O(logn) |
insert() | O(logn) |
remove() | O(logn) |
Source: benchmark_AVLTree.cpp
The program compares the time cost of AVLTree
with std::multiset
. It calculates the average time cost of each operation. Note that the std::multiset
is implemented as a red-black tree , which is faster than the AVL tree when performing insert()
and remove()
operations(but slower when performing find()
). Here are the results under different environments:
Operation | std::multiset | AVLTree |
---|---|---|
find() | 1245 ns | 1056 ns |
insert() | 1241 ns | 1447 ns |
remove() | 1289 ns | 1728 ns |
Operation | std::multiset | AVLTree |
---|---|---|
find() | 1597 ns | 168 ns* |
insert() | 1570 ns | 1260 ns |
remove() | 233 ns | 436 ns |
#include "tastylib/Graph.h" #include <string> using namespace tastylib; int main() { // Create a graph that has three vertices. // Each vertex stores a string object. Graph<std::string> graph(3); // Modify the string object in each vertex graph[0] = "I am vertex 0."; graph[1] = "I am vertex 1."; graph[2] = "I am vertex 2."; // Add edges graph.setWeight(0, 1, 10); graph.setWeight(0, 2, 20); graph.setWeight(1, 2, 30); // Get edge weights auto w0 = graph.getWeight(0, 1); // w0 == 10 auto w1 = graph.getWeight(0, 2); // w1 == 20 auto w2 = graph.getWeight(1, 2); // w2 == 30 // Get adjacent vertices auto n0 = graph.getNeighbors(0); // n0 == [1, 2] auto n1 = graph.getNeighbors(1); // n1 == [2] auto n2 = graph.getNeighbors(2); // n2 == [] return 0; }
#include "tastylib/MD5.h" using namespace tastylib; int main() { // hashedMsg == "2dabbfd553b67530e4892eb9481121fa", // which is the MD5 value of the message "TastyLib" std::string hashedMsg = MD5<>::getInstance()->hash("TastyLib"); return 0; }
Source: benchmark_MD5.cpp
The program uses the MD5 algorithm to hash a fixed message of 200 MB for several times and calculates the average time cost. Here are the results:
Environment | Average Time |
---|---|
Ubuntu 16.04 64-bit / g++ 5.4 | 899 ms |
Windows 10 64-bit / Visual Studio 14 2015 | 1229 ms |
#include "tastylib/NPuzzle.h" using namespace tastylib; int main() { // The beginning node and the ending node of a 3*3 puzzle problem. // Number '0' indicates the empty grid and number '1-8' denote other eight grids. PuzzleNode<> beg({0, 2, 3, 1, 4, 5, 6, 7, 8}, 3, 3); PuzzleNode<> end({1, 2, 3, 4, 0, 5, 6, 7, 8}, 3, 3); // Solve the problem NPuzzle<> puzzle(beg, end); puzzle.solve(); // List 'path' stores the move directions from the beginning node // to the ending node. Its contents must be [DOWN, RIGHT]. std::list<Direction> path = puzzle.getPath(); return 0; }
Source: benchmark_NPuzzle.cpp
The program solves 3*3
, 4*4
, 5*5
and 6*6
puzzle problems respectively and generates the information of overheads. Here are the outputs of the benchmark under different environments:
Benchmark of NPuzzle running... Benchmarking 3*3 puzzle... Beg: {0, 6, 4, 3, 8, 2, 7, 5, 1} End: {6, 5, 7, 3, 1, 2, 8, 4, 0} Searching... Searched nodes: 161 Time cost: 1 ms Efficiency: 102.092581 node/ms Path length: 52 Solution check: pass Benchmark of 3*3 puzzle finished. Benchmarking 4*4 puzzle... Beg: {3, 0, 2, 13, 9, 1, 5, 6, 14, 11, 4, 10, 12, 7, 15, 8} End: {4, 2, 9, 3, 13, 0, 5, 6, 15, 12, 11, 14, 8, 10, 1, 7} Searching... Searched nodes: 1330 Time cost: 13 ms Efficiency: 98.089830 node/ms Path length: 117 Solution check: pass Benchmark of 4*4 puzzle finished. Benchmarking 5*5 puzzle... Beg: {6, 16, 17, 0, 8, 3, 2, 11, 5, 9, 4, 21, 13, 23, 18, 15, 7, 1, 20, 14, 22, 12, 10, 19, 24} End: {12, 10, 19, 22, 2, 5, 0, 20, 3, 4, 21, 6, 18, 13, 24, 11, 1, 8, 9, 7, 15, 14, 17, 16, 23} Searching... Searched nodes: 3676 Time cost: 43 ms Efficiency: 84.968680 node/ms Path length: 275 Solution check: pass Benchmark of 5*5 puzzle finished. Benchmarking 6*6 puzzle... Beg: {15, 3, 1, 4, 5, 13, 18, 2, 14, 0, 9, 10, 8, 27, 20, 24, 23, 16, 26, 30, 6, 34, 25, 21, 31, 28, 11, 12, 7, 29, 32, 19, 35, 17, 22, 33} End: {13, 6, 9, 2, 27, 16, 11, 14, 12, 15, 21, 17, 7, 20, 32, 4, 5, 3, 10, 18, 8, 19, 29, 23, 1, 26, 25, 24, 34, 33, 31, 35, 30, 0, 22, 28} Searching... Searched nodes: 69271 Time cost: 1088 ms Efficiency: 63.638602 node/ms Path length: 450 Solution check: pass Benchmark of 6*6 puzzle finished. Benchmark of NPuzzle finished.
#include "tastylib/Sort.h" using namespace tastylib; int main() { const unsigned n = 5; int arr[n] = {5, 4, 3, 2, 1}; { // Sort. // Aftering running each of the function below, // the array's content will be: [1, 2, 3, 4, 5] insertionSort(arr, n); selectionSort(arr, n); heapSort(arr, n); quickSort(arr, 0, n - 1); } { // Find the kth smallest element. // After running the function below, the kth // smallest element will be stored at arr[k]. int k = 1; // Find the second smallest element quickSelect(arr, 0, n - 1, k); } return 0; }
Operation | Time | Stable |
---|---|---|
insertionSort() | O(n^2) | Yes |
selectionSort() | O(n^2) | No |
heapSort() | O(nlogn) | No |
mergeSort() | O(nlogn) | Yes |
quickSort() | O(nlogn) | No |
quickSelect() | O(n) | - |
Source: benchmark_Sort.cpp
The program calculates the average time cost to sort or find the kth element in an array of 100000
elements. Here are the results under different environments:
Operation | Time |
---|---|
std::sort() | 6.41 ms |
insertionSort() | 1691.64 ms |
selectionSort() | 12220.25 ms |
heapSort() | 10.61 ms |
quickSort() | 7.07 ms |
std::nth_element() | 0.79 ms |
quickSelect() | 0.86 ms |
Operation | Time |
---|---|
std::sort() | 8.34 ms |
insertionSort() | 1559.92 ms |
selectionSort() | 4355.90 ms |
heapSort() | 11.56 ms |
quickSort() | 6.79 ms |
std::nth_element() | 1.08 ms |
quickSelect() | 0.88 ms |
#include "tastylib/Dijkstra.h" #include <string> using namespace tastylib; int main() { DijkGraph<string> graph(3); // Create a graph that has three vertices graph[0].val = "Alice"; // Vertex 0 denotes Alice's home graph[1].val = "Darth"; // Vertex 1 denotes Darth's home graph[2].val = "Bob"; // Vertex 2 denotes Bob' home graph.setWeight(0, 1, 5); // Distance from Alice's home to Darth's is 5 graph.setWeight(0, 2, 20); // Distance from Alice's home to Bob's is 20 graph.setWeight(1, 2, 5); // Distance from Darth's home to Bob's is 5 // Run Dijkstra's algorithm to compute the // shortest path from Alice's home to others'. dijkstra(graph, 0); // Now I know the minimum distance from Alice's home to Bob's home, which is 10. auto distToBob = graph[2].dist; // distToBob == 10 // I also know the minimum path to Bob's home is "0->1->2", namely "Alice->Darth->Bob". auto prev1 = graph[2].prev; // prev1 == 1 auto prev2 = graph[prev1].prev; // prev2 == 0 return 0; }
See the LICENSE file for license rights and limitations.