Treap: The Easiest Search Tree (Explained) | Alex Dremov (2024)

A Cartesian tree, or treap (binary search tree + binary heap), is a fast yet simple data structure. It satisfies the binary search tree property and the binary heap property at the same time. Despite its simplicity, a treap self-balances, resulting in O(log n) complexity on average for all common operations.

Amazing, right?

💥

The algorithm uses random values, so the O(log n) complexity holds on average. However, with a lot of items, O(log n) is almost always true. So, later in this article, I will write just O(log n) without the "on average" qualifier.

Moreover, there is a modification (implicit treap, or treap with an implicit key) that lets you use a treap as a regular array with O(log n) random insertions and random deletions. Isn't it cool? In this article, I'll explain how to create one and provide the implementation in Swift. Also, I will compare the treap to std::set from the standard library. Let's start!

💡

In a binary search tree, for each node, all items' values in the left subtree are less than the node's value, and all items in the right subtree are greater

Core algorithm

As I said earlier, a treap combines heaps and binary search trees. Therefore, we are going to store at least two properties in each node: a key (or value) and a priority. By the key, the tree is a search tree; by the priority, it is a binary heap.

💡

A binary heap is a binary tree where each child's value is less than its parent's value

[Figure 1: an example treap; keys increase from left to right, priorities decrease from top to bottom]

In the image above, you may notice that for every node, all children's priorities are lower. At the same time, all children on the left have keys less than the node's key, and all children on the right have larger keys.

💡

It's also called a Cartesian tree, as it can be drawn on a regular 2D grid with (key, priority) coordinates for each node. Just like in the image above.

To create a fully-functioning search tree, we need to implement:

  • find
  • insert
  • remove

More exotic operations like lower bound and upper bound are also pretty simple and do not differ from those in other search trees. And all these operations can be implemented using just two helper operations!

How to do that?

💥

Split

Splits the tree into two trees by a given value. All values in the left tree are less than the given value, while all values in the right tree are greater. Both resulting trees are correct treaps.

We will use a special flag that decides whether values equal to the split value go to the left tree or to the right tree.

[Figure 2: splitting a treap by a value into two treaps]

💥

Merge

Merges two treaps into one big treap.
Prerequisite: all items in the first tree are less than all items in the second tree.

[Figure 3: merging two treaps into one]

So, once we implement these two methods, implementing the three operations above becomes trivial.

Split

Let's start thinking about code at this stage. I'm going to explain this in C++. Rewriting the following code in Swift is actually really easy. Leave a comment below if you need help.

template<typename T>
struct Node {
    T key;
    size_t prior;
    Node *left = nullptr, *right = nullptr;

    Node(T key, size_t prior) :
        key(std::move(key)),
        prior(prior) {}
};

For split, we have a head node and a key for which split needs to be done. This method is extremely simple using recursion.

Algorithm

Let the current head be p.

  • If p->key is less than the key, then we need to go right and split p->right further.

    Splitting the right subtree also yields two trees. The first one holds keys that are less than the key, yet greater than p->key (as they come from p's right subtree).
    So, we set p->right to the first tree of that split.

    Result: p, split right's second tree

  • If the p->key is greater or equal to the key, then we need to go left and split p->left further.

    Similarly to the case above, we set p->left to the second tree of split left.

    Result: split left's first tree, p

The algorithm above leaves a node that is equal to the split value in the second tree. Symmetrically, we will use the equalOnTheLeft flag to leave the node in the left tree instead.

So, the final code:

pair<Node *, Node *> split(Node *p, const T &key, bool equalOnTheLeft = false) {
    if (!p) // reached leaf
        return {nullptr, nullptr};
    if (p->key < key || (equalOnTheLeft && p->key == key)) {
        // splitting right
        auto q = split(p->right, key, equalOnTheLeft);
        // q.first has nodes of the right
        // subtree that are less than key
        p->right = q.first;
        return {p, q.second};
    } else {
        // splitting left
        auto q = split(p->left, key, equalOnTheLeft);
        // q.second has nodes of the left
        // subtree that are greater or equal
        // to the key
        p->left = q.second;
        return {q.first, p};
    }
}

💡

Priorities are not used and not changed during the split procedure. The resulting trees keep the correct order of priorities because the initial tree had it and split never reorders nodes


Merge

Merge is similar to split, but it uses priorities to do the work. As I mentioned before, there is a prerequisite: all items in the first merged tree must be less than items in the second tree. If this is not true, another algorithm must be used.

Algorithm

Similarly to split, merge is also recursive. Let us have two trees to merge: l and r.

  • We need to choose which tree will represent the new head. That's simple — the head must have the greatest priority, so we choose l or r based on that.

💡

Notice that the head node of l has the highest priority in the whole l tree, as that's a property of a correct treap. The same applies to r.

  • If l has the greater priority, then the l->left subtree remains intact: everything in it is less than everything in r, so the merge does not touch it.

    Then, the l->right subtree is merged with r, and the result becomes the new l->right subtree.

  • If r has the greater priority, then, similarly to the case above, r->right remains intact and r->left is merged with l
Node* merge(Node *l, Node *r) {
    if (!l) // left is empty
        return r;
    if (!r) // right is empty
        return l;
    if (l->prior > r->prior) {
        // l has the new head
        l->right = merge(l->right, r);
        return l;
    } else {
        // r has the new head
        r->left = merge(l, r->left);
        return r;
    }
}

Why is it correct?

It seems like nothing stops us from breaking the search tree structure where all items' values in the left subtree are less than the node's value, and all items in the right subtree are greater.

💡

The prerequisite preserves the binary search tree property: items are never reordered, and the l < r property is always maintained

Implementing search tree methods

You believed me that all methods are easy to implement through split and merge. Time to prove that.

Find

Find is implemented just like in a general search tree. We use the fact that keys in the left subtree are less than the value in the node, and keys in the right subtree are greater.

Node* find(Node* node, const T& key) {
    if (node == nullptr)
        return nullptr;
    if (node->key == key)
        return node;
    return find(key > node->key ? node->right : node->left, key);
}

Insert

Let's think about insert in terms of split and merge. We have one big tree and we need to insert a new key.

  • Split the tree by the key into two trees: first (values less than the key) and second (values greater than or equal to the key).

    We can check whether the node already exists by searching for it in the second tree.

💥

The implementation requires that each item appears only once.

If you need to insert multiple copies of the same item, you can store the item along with its count to achieve that

  • Create a new node newNode that will store the new key. Ta-da, this node is a correct treap that has only one node.

    For the new node, you need to set a random priority

💥

Random priorities are the key to the complexity. They make the Cartesian tree balance itself, giving O(log n) complexity for all operations

  • The new head will be merge(first, merge(newNode, second))

See? It's that simple.

[Figure 4: insertion implemented via split and merge]
Node* insert(Node* head, T key) {
    auto splitted = split(head, key);
    if (find(splitted.second, key) != nullptr) {
        // Key exists already. Merge back
        return merge(splitted.first, splitted.second);
    }
    auto newNode = new Node(std::move(key), rand());
    return merge(splitted.first, merge(newNode, splitted.second));
}

Remove

It's very similar to insert. However, that's where the equalOnTheLeft flag is used.

💡

Remember that the second tree produced by split contains items greater or equal to the selected key

Therefore, the second tree will contain the value that needs to be removed. But how to remove it from the tree?

Split again.

We can split the second tree by the key, setting the equalOnTheLeft flag to true. Thus, the node will be separated from the second tree into its own one-node tree.

💡

After the two splits separate the node to be deleted, the unneeded node is easily freed and everything else is merged back.

[Figure 5: removal implemented via two splits and a merge]
Node *remove(Node *head, const T &key) {
    auto splitted = split(head, key);
    if (splitted.second) {
        auto secondSplit = split(splitted.second, key, /*equalOnTheLeft=*/true);
        auto everythingElse = secondSplit.second;
        if (secondSplit.first == nullptr) {
            // There's no element equal to key. Merge back
            return merge(splitted.first, everythingElse);
        }
        // The node with the key value is in secondSplit.first
        delete secondSplit.first;
        return merge(splitted.first, everythingElse);
    }
    // Key is not present. Merge back
    return merge(splitted.first, splitted.second);
}

Full code

You can download the C++ code of a slightly optimized treap here:

Treap C++ code (allocations-optimised): treap.h, 5 KB
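Before moving on to benchmarks, here is a compact, self-contained consolidation of the snippets above with int keys, so you can see all the pieces working together. It is a simplified sketch, not the optimized download (the remove here takes a shortcut: it deletes the equal node straight from the second split without an explicit find):

```cpp
#include <cstdlib>
#include <utility>

struct Node {
    int key;
    unsigned prior;
    Node *left = nullptr, *right = nullptr;
    Node(int key, unsigned prior) : key(key), prior(prior) {}
};

// Left result: keys < key; right result: keys >= key
// (or <= / > with equalOnTheLeft)
std::pair<Node*, Node*> split(Node *p, int key, bool equalOnTheLeft = false) {
    if (!p) return {nullptr, nullptr};
    if (p->key < key || (equalOnTheLeft && p->key == key)) {
        auto q = split(p->right, key, equalOnTheLeft);
        p->right = q.first;
        return {p, q.second};
    }
    auto q = split(p->left, key, equalOnTheLeft);
    p->left = q.second;
    return {q.first, p};
}

// Prerequisite: every key in l is less than every key in r
Node* merge(Node *l, Node *r) {
    if (!l) return r;
    if (!r) return l;
    if (l->prior > r->prior) {
        l->right = merge(l->right, r);
        return l;
    }
    r->left = merge(l, r->left);
    return r;
}

Node* find(Node *node, int key) {
    if (!node) return nullptr;
    if (node->key == key) return node;
    return find(key > node->key ? node->right : node->left, key);
}

Node* insert(Node *head, int key) {
    auto parts = split(head, key);
    if (find(parts.second, key)) // already present, merge back
        return merge(parts.first, parts.second);
    auto node = new Node(key, (unsigned)rand());
    return merge(parts.first, merge(node, parts.second));
}

Node* remove(Node *head, int key) {
    auto parts = split(head, key);
    auto equal = split(parts.second, key, /*equalOnTheLeft=*/true);
    delete equal.first; // the single equal node, or nullptr if absent
    return merge(parts.first, equal.second);
}
```

Since insert rejects duplicates, the second split in remove isolates at most one node, so a plain delete is safe.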

Comparing to std::set

First of all, the implemented version of treap uses the split and merge methods. Note that there is a more efficient implementation that uses rotations. However, the true power of treap is in split and merge, as other search trees can't do these operations easily.

Find tests

[Figure 6: find benchmark, treap vs std::set]

It's visible that the asymptotics are similar, though treap always has the greater overhead. Still, it's a good result! We're competing with a heavily optimized standard library data structure.

Inserts

[Figure 7: insert benchmark, treap vs std::set]

Insertion has an even bigger overhead. And that was expected: recursive calls of merge and split do not improve performance ;)

TreapProject comparison tests (outputs CSV of time measurements): TreapProject.zip, 7 KB

Comparison conclusion

[Figure 8: average node height, treap vs AVL tree]

As you can see, treap nodes have a greater average height than those of the very well-balanced AVL tree.

Yes, treap has worse performance than std::set. Yet, the results are comparable, and with a large data size, treap gets closer and closer to std::set, which is in fact a red-black tree.

Believe me, you don't want to write your own RB tree. It's a nightmare.

Use cases and modifications

We developed this data structure not just to lose to std::set. There are several useful applications.

Sum of numbers in the interval

We need to modify the Node structure, adding a sum field. It will store the sum of the node itself and all of its children.

template<typename T>
struct Node {
    T key;
    size_t prior;
    long long sum;
    Node *left = nullptr, *right = nullptr;

    Node(T key, size_t prior) :
        key(std::move(key)),
        prior(prior),
        sum(this->key) {} // a single node's sum is its own key
};

It's extremely easy to keep sum up to date. Every time a node's children change, set sum = key + (left ? left->sum : 0) + (right ? right->sum : 0). So, you can implement an update function and call it in split and merge right before returning a node. That's it.
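As a sketch, such an update helper might look like this (using int keys for brevity; the name update is illustrative):

```cpp
struct Node {
    int key;
    unsigned prior;
    long long sum;
    Node *left = nullptr, *right = nullptr;
    Node(int key, unsigned prior) : key(key), prior(prior), sum(key) {}
};

// Recompute sum from the children; call this in split and
// merge right before returning a node
void update(Node *p) {
    if (!p) return;
    p->sum = p->key
           + (p->left ? p->left->sum : 0)
           + (p->right ? p->right->sum : 0);
}
```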

How to answer a request?

We receive an interval [l, r]. To calculate the sum of numbers in this interval, we can split the tree by l, then split the second tree of the result by r+1 (or by r, leaving equal elements on the left). In the end, we will have a tree containing exactly the stored numbers in the interval [l, r], and its root's sum field is the answer.

Complexity: O(logn) versus O(n) naive.
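The whole query can be sketched end-to-end. This is a simplified int-key version with the sum bookkeeping wired into split and merge; sumInterval and its signature are my own naming, not taken from the downloadable code:

```cpp
#include <utility>

struct Node {
    int key;
    unsigned prior;
    long long sum;
    Node *left = nullptr, *right = nullptr;
    Node(int key, unsigned prior) : key(key), prior(prior), sum(key) {}
};

void update(Node *p) {
    if (!p) return;
    p->sum = p->key + (p->left ? p->left->sum : 0)
                    + (p->right ? p->right->sum : 0);
}

std::pair<Node*, Node*> split(Node *p, int key, bool equalOnTheLeft = false) {
    if (!p) return {nullptr, nullptr};
    if (p->key < key || (equalOnTheLeft && p->key == key)) {
        auto q = split(p->right, key, equalOnTheLeft);
        p->right = q.first;
        update(p);
        return {p, q.second};
    }
    auto q = split(p->left, key, equalOnTheLeft);
    p->left = q.second;
    update(p);
    return {q.first, p};
}

Node* merge(Node *l, Node *r) {
    if (!l) return r;
    if (!r) return l;
    if (l->prior > r->prior) {
        l->right = merge(l->right, r);
        update(l);
        return l;
    }
    r->left = merge(l, r->left);
    update(r);
    return r;
}

// Sum of all stored keys in [l, r]; the tree is split apart,
// the middle part's sum is read, and everything is merged back
long long sumInterval(Node *&head, int l, int r) {
    auto a = split(head, l);                              // a.second: keys >= l
    auto b = split(a.second, r, /*equalOnTheLeft=*/true); // b.first: keys in [l, r]
    long long result = b.first ? b.first->sum : 0;
    head = merge(a.first, merge(b.first, b.second));      // stitch the tree back
    return result;
}
```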

Using a hash of value in place of priority

You can use a hash of value as a priority as a good hash function is pretty random. What benefits does it bring?

If keys and priorities are fixed, then no matter how you construct the treap or add elements, it's always going to have the same structure.

💡

You may think about it this way: keys fix the x axis and priorities fix the y axis of the treap

Therefore, you can compare two sets in O(n) as treaps containing the same values will have absolutely the same structure.
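A minimal sketch of this idea: derive the priority from the key with std::hash instead of rand() (the helper name hashPriority is illustrative):

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Deterministic priority: the same key always gets the same
// priority, so a given set of keys has exactly one treap shape,
// regardless of insertion order
template<typename T>
size_t hashPriority(const T &key) {
    return std::hash<T>{}(key);
}
```

With priorities produced this way, two treaps holding the same keys are structurally identical, so an O(n) simultaneous traversal suffices to compare the sets.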

Implicit treap

What if we use the size of the left subtree as a key? Then, we can use this key as an index. Wow. That means that we can represent a regular ordered array as a treap!

By doing this, we can:

  • make insertions by random index
    O(logn) versus O(n) naive
  • make deletions by random index
    O(logn) versus O(n) naive

With great power comes great responsibility.

😡

Access by random index downgrades to O(logn) versus O(1) in the standard array.

If your algorithm requires a lot of array modifications and very few accesses/outputs, then it's the right choice. Moreover, you can convert treap into an array and back with O(n) complexity.
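The core change from the regular treap is splitting by subtree size instead of by key. A minimal sketch of an implicit treap over ints (function names like splitBySize and insertAt are my own, illustrative naming):

```cpp
#include <cstdlib>
#include <utility>

struct Node {
    int value;
    unsigned prior;
    int size = 1; // number of nodes in this subtree
    Node *left = nullptr, *right = nullptr;
    Node(int value, unsigned prior) : value(value), prior(prior) {}
};

int sizeOf(Node *p) { return p ? p->size : 0; }

void update(Node *p) {
    if (p) p->size = 1 + sizeOf(p->left) + sizeOf(p->right);
}

// The left result receives the first `count` elements
std::pair<Node*, Node*> splitBySize(Node *p, int count) {
    if (!p) return {nullptr, nullptr};
    if (sizeOf(p->left) < count) {
        // root plus its left subtree go left, take the rest from the right
        auto q = splitBySize(p->right, count - sizeOf(p->left) - 1);
        p->right = q.first;
        update(p);
        return {p, q.second};
    }
    auto q = splitBySize(p->left, count);
    p->left = q.second;
    update(p);
    return {q.first, p};
}

Node* merge(Node *l, Node *r) {
    if (!l) return r;
    if (!r) return l;
    if (l->prior > r->prior) {
        l->right = merge(l->right, r);
        update(l);
        return l;
    }
    r->left = merge(l, r->left);
    update(r);
    return r;
}

// Insert value so that it ends up at position pos
Node* insertAt(Node *head, int pos, int value) {
    auto parts = splitBySize(head, pos);
    auto node = new Node(value, (unsigned)rand());
    return merge(parts.first, merge(node, parts.second));
}

// Read the element at position pos (pos must be valid)
int valueAt(Node *p, int pos) {
    int leftSize = sizeOf(p->left);
    if (pos == leftSize) return p->value;
    return pos < leftSize ? valueAt(p->left, pos)
                          : valueAt(p->right, pos - leftSize - 1);
}
```

Deletion by index is symmetric: split out the single element with two splitBySize calls and merge the rest back.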

💥

I have implicit treap implemented in Swift. It behaves just like the general array and implements a lot of optimisations. Check it out!

swift-collections/Sources/OrderedCollections/TreeArray at main · AlexRoar/swift-collections (GitHub)

Cut-paste problem

Imagine that you have a big string and you receive requests to cut some part of it and insert it somewhere else.

This problem can be solved using a treap with an implicit key. You can use splits to cut the needed part and merge to insert it.
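A sketch of the cut-paste operation on top of an implicit treap of characters (again, splitBySize, cutPaste, and the conversion helpers are illustrative names; note that the paste position is an index into the string with the piece already removed):

```cpp
#include <cstdlib>
#include <string>
#include <utility>

struct Node {
    char value;
    unsigned prior;
    int size = 1;
    Node *left = nullptr, *right = nullptr;
    Node(char value, unsigned prior) : value(value), prior(prior) {}
};

int sizeOf(Node *p) { return p ? p->size : 0; }
void update(Node *p) { if (p) p->size = 1 + sizeOf(p->left) + sizeOf(p->right); }

std::pair<Node*, Node*> splitBySize(Node *p, int count) {
    if (!p) return {nullptr, nullptr};
    if (sizeOf(p->left) < count) {
        auto q = splitBySize(p->right, count - sizeOf(p->left) - 1);
        p->right = q.first;
        update(p);
        return {p, q.second};
    }
    auto q = splitBySize(p->left, count);
    p->left = q.second;
    update(p);
    return {q.first, p};
}

Node* merge(Node *l, Node *r) {
    if (!l) return r;
    if (!r) return l;
    if (l->prior > r->prior) {
        l->right = merge(l->right, r);
        update(l);
        return l;
    }
    r->left = merge(l, r->left);
    update(r);
    return r;
}

Node* fromString(const std::string &s) {
    Node *head = nullptr;
    for (char c : s)
        head = merge(head, new Node(c, (unsigned)rand()));
    return head;
}

void toString(Node *p, std::string &out) {
    if (!p) return;
    toString(p->left, out);
    out += p->value;
    toString(p->right, out);
}

// Cut s[from, from + len) and paste it at position `to`
Node* cutPaste(Node *head, int from, int len, int to) {
    auto a = splitBySize(head, from);      // prefix | rest
    auto b = splitBySize(a.second, len);   // piece | suffix
    Node *rest = merge(a.first, b.second); // string without the piece
    auto c = splitBySize(rest, to);        // reopen at the target position
    return merge(c.first, merge(b.first, c.second));
}
```

Each request costs three splits and three merges, so O(log n) per cut-paste instead of O(n) for a plain string.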

FAQ

Cartesian trees are most suitable for what?

Treap is useful when you need to collect some kind of characteristic on an interval (for example, sum) or apply some modification to the interval. Treap with implicit key is also useful when you need to apply a lot of random tree insertions/deletions with few accesses.

Why don't we use array indices as keys for an implicit treap?

Because in the case of insertion, we would need to recalculate all indices that are higher than the inserted index. Therefore, it degrades complexity to O(n).

Is treap a randomized tree?

Yes, it is. But it can also use hash value in place of a random value.

I know about implementation without split and merge. It utilizes left and right turns. Is it better?

For example, GeeksforGeeks uses such an implementation, I know. But I believe that the true value of treap is in seamless splits and merges. You've already seen from the examples how useful they are. Why implement a treap with rotations when you can build an AVL tree that's probably going to be faster?

Love data structures?

Check out my article on the amazing Skip List! While a lot of people have never heard of it, Skip List is beautiful and can solve, for example, the problem of finding the n-th maximum or the rolling median problem in the most efficient way.

Skip List Indexation and kth Maximum | Alex Dremov

Also, you can check the whole algorithms section of my blog

Alex Dremov | Algorithms

References

Introduction To Algorithms

Google Books

Cartesian Tree (Декартово дерево) — Algorithmica

https://www.cs.cmu.edu/~scandal/papers/treaps-spaa98.pdf
