Skip to main content
warning

This module is unstable. Main API is formed, but it may change a bit in the future.

Kone: Collections

This module provides revised collections API and their different fine-tuned implementations. The collections may be helpful in the implementation of more demanding algorithms. All further modules are based on the collections.

Applying the dependencies

build.gradle.kts
dependencies {
implementation("dev.lounres:kone.collections:0.0.0-experiment")
}

Instead of introduction

There are several problems with Kotlin stdlib's collections:

  1. Such collections like set or map are fixed of the default equality and hashing (and assume that the later one is always available and efficient).
  2. Arrays are always mutable and there are no immutable alternatives.
  3. Iterators API is minimalistic and glues together getting next element and moving forward, which is not that flexible for fine-tuned behaviours.
  4. Maps have no throwing method of getting a value by key in the way lists have it. It's a problem because you have to trace over map's structure twice just to retrieve a value.
  5. Etc.

So there was a decision to revise everything collections API and their implementations.

Iterators and iterables API

Iterators in Kone can be used to iterate over different kinds of structures in different ways. They may iterate over finite or infinite number of elements. They may change the structure. They may interact with the elements in two ways, forward and backward.

Iterator assumes that some set of elements is represented as a linear sequence of elements that is used to iterate over them. This sequence may be finite or infinite (it may have no first element, no last element, or both). In each moment (when it does not perform some operation) it has a cursor that points either between two elements, before the first one, or after the last one. (If there are no elements in the collection, the pointer just points nowhere.) There are 2 types of operations that iterators can perform: non-mutating and mutating ones. Non-mutating operations are:

  • hasNext — checks if there is next element in the sequence,
  • getNext — returns the next element if there is one,
  • moveNext — makes cursor jump over the next element if there is one,
  • nextIndex — returns index of the next element if the sequence is indexed (in the way that each next element has index one more then previous element) and there is next element,
  • hasPrevious, getPrevious, movePrevious, and previousIndex — like the three above but in the opposite direction.

Mutating operations are:

  • addNext — adds new element in the space the cursor is pointing on and puts cursor before the new element,
  • setNext — changes the next element if there is one,
  • removeNext — removes the next element if there is one, preserves previous element, and makes next element after the next one the next element,
  • addPrevious, setPrevious, and removePrevious — like the three above but in the opposite direction.

Iterators form an inheritance hierarchy below.

Common iterators hierrarchyCommon iterators hierrarchy

Iterables are finite structures of certain size that can be iterated using iterators mentioned above. It means that their elements can form a finite sequence that can be iterated using the rules above.

warning

It is guaranteed that after non-mutating operations, the iterator is still usable. However, there is no such guarantee for mutating operations. It means that the inner structure of the iterable may be changed in a such way that the state of the iterator may be undefined. For example, adding an element to a hashtable can provoke hashtable reconstruction that changes the sequence of elements to iterate undefinedly. But some structures can guarantee the safety of their iterators (for example, all lists can).

Also, be aware that there are a bit more kinds of iterators that are made especially for other APIs. For example, there are other iterator types for lists and for sets.

💭 Concept: "nodes" and "nodded" collections

As you know, there are several kinds of lists. Let's consider a (doubly) linked list. It has a great performance of insertion in the middle. But there is one nuance: it is that performant only in cases you have a pointer to this middle. So we have to somehow have a pointer to internal parts of the structure. And it's not the only case where we need such pointers. That's why there are nodes and nodded structures in Kone.

Node is a pointer to part of the collection's inner structure where the element is stored. Nodded collection is a collection that provides a node for each of its elements. In particular, one can get element's node the moment it added the element into the collection via specific addition method. For example of the methods, KoneMutableNoddedList.addNode or KoneMutableNoddedSet.addNode.

warning

Collections return (should return) the same node instance on any request for the same element's node. I.e. there aren't (should not be) two different instances that are nodes of the same element.

All nodes provide API to get their element, isDetached property to know if the node is no longer part of corresponding structure, and other methods.

Lists and arrays API

Lists are finite sequences of elements with indices from zero to its size exclusively. There are different kinds of lists: immutable, settable, mutable, growable mutable, and more. Each kind has two variants: usual and nodded.

List nodes API

There are three kinds of nodes: immutable KoneListNode, settable KoneSettableListNode, and mutable KoneMutableListNode.

List nodes hierrarchyList nodes hierrarchy

Immutable node provides:

  1. element and index properties to get an element at specified place and the place's index. (In particular, the element and index properties may return new values each time they are accessed. And index property may throw exception if the node is detached.)
  2. isDetached property that indicates that the node is detached and is no longer part of the list's structure.
  3. nextNode and previousNode properties to get next and previous nodes (or null if there is no one).
  4. iteratorFromAfterHere and iteratorFromBeforeHere methods that create iterators with cursors pointing on a space after and before the node respectively.

Settable node also provides a mutable version of element property that allows changing element at the corresponding place in the list. Mutable node also provides remove method that allows to detach the node from the structure removing its element from the list.

List iterators API

List iterators form a hierarchy below.

List iterators hierrarchyList iterators hierrarchy

The three interfaces KoneListIterator, KoneSettableListIterator, and KoneMutableListIterator are actually type aliases to KoneLinearIterator, KoneSettableLinearIterator, and KoneMutableLinearIterator. The bottom three interfaces KoneNoddedListIterator, KoneSettableNoddedListIterator, and KoneMutableNoddedListIterator inherit the interfaces above correspondingly and introduce methods getNextNode and getPreviousNode to get nodes of next and previous elements.

info

When iterator of a list is constructed, the iteration sequence is the list itself. And there is a guarantee that mutable iterator operations would leave iterator still usable. (But there is no guarantee that mutable list operations and other iterators operations would leave the iterator still usable.)

Lists API

There are several kinds of lists that are presented in Kone:

  1. Immutable lists &mdash lists which are read-only.
  2. Settable lists — lists of fixed size that can only accept new elements instead of old ones. Cannot be extended nor reduced.
  3. Fixed capacity lists — mutable lists that are backed by fixed arrays of fixed size. Thus, they can be mutated but with a condition that at each moment number of elements won't be higher than specified capacity.
  4. Resizable lists — mutable lists that are backed by array that will be expanded or shrunk (i.e. reinitialised with new capacity higher or lower than the previous one) depending on the current number of elements. Usually their capacity is a power of two.
  5. Growable lists — mutable lists that are backed by array that will be expanded (i.e. reinitialised with new capacity higher than the previous one) depending on the current number of elements. Usually their capacity is a power of two.
  6. GC-backed lists — mutable lists that do not contain any array in their structure. Thus, they are equivalent to (doubly) linked lists.

Each kind has two variants: usual and nodded. But on the level of interfaces, fixed capacity, resizable lists, and GC-backed lists have no differences, so they are represented with just KoneMutableList. And growable lists introduce only ensureCapacity method, so they are represented with KoneGrowableMutableList.

Lists hierrarchyLists hierrarchy

We are not going to discuss the list interfaces in details, because you can find all specific moments in API reference. But here is a short overview of the API:

  1. KoneList only provides access to elements by index and via iterators.
  2. KoneSettableList also lets to replace element by index and via iterators.
  3. KoneMutableList also lets to extend (by putting more elements) and reduce (by removing elements) it.
  4. KoneGrowableMutableList also lets to inform it that user is going to use corresponding capacity.

Arrays API

In Kone, arrays are just value classes that wrap Kotlin stdlib arrays. But they have a bit of advantage:

  1. There are KoneArray and KoneXXXArrays variants (XXX is a name of primitive type) that are immutable. It allows one to use performant arrays on JVM while providing safe API. Of course, there are mutable KoneMutableArray and KoneMutableXXXArrays variants.
  2. They inherit KoneList and KoneSettableList interfaces which makes their usage a bit more flexible.
info

Unfortunately, there is no way to inherit KoneArray by KoneMutableArray for now. (See KT-42977 for more.) So for corner cases when you want to convert a mutable array into an immutable array without extra allocations, there are asKoneArray and asKoneXXXArray extensions.

Deques API

Deques are list-like structures that allow to get, put, and remove elements from either its beginning or its end. They are represented via KoneDeque interface. There is nothing special here. See API reference for details.

💭 Concept: "contextful" collections

Let's notice one more time that "equality" is a contextual concept. Thus, the concept "belonging to the collection" is contextual as well, because it uses the equality context. That's why "belonging to the list" concept is implemented as the following extension function:

context(_: Equality<Element>)
public operator fun <Element> KoneIterable<Element>.contains(element: Element): Boolean = any { it eq element }

But what about collections like sets and maps for what contains method is a part of their nature? They have to declare the function as a method, not as an extension function. So should they have context parameter of Equality type? Well, it looks good that the method is also contextual. In that case, its logic assumes that equality is part of the context, but not part of the instance.

But do the inner structures of sets and maps think the same? Should the equality be really independent of the collection instance? If you think from collection's point of view, different equality instances can say absolutely different things about the collection's elements. There is no interrelation between the equalities. Any attempt to structure the collection in any way that would optimise the time needed to find an equal element would lead to a complete failure. So the only way to find an equal element is just to iterate over all elements of the collection one by one.

But if we want to implement some structure (a hash table or a search tree), we have to fix some equality (and maybe some other contexts). The fixed contexts should be coordinated with each other and with data structure. It means data structure does contain this context. Hence, some collections use inner equality and other contexts instead of contextual ones and require them while being constructed. Such collections are called "contextful" in Kone.

info

Unfortunately, there is no simple way to serialise contextful collections, because there is no simple way to serialise contexts. Thus, lists and arrays are serialisable whereas sets and maps are not.

Heaps API

In Kone, minimum/maximum heaps are structures that consist of:

  1. an order on values called "priorities" (thus, heap is contextful)
  2. and a tree (in computer science sense of the word) that holds a pair of one priority value and one "element" value in each its vertex and satisfies heap property: for any given vertex CC, if PP is the parent vertex of CC, then the priority of PP is less/greater than or equal to the priority of CC.

Heap is a nodded structure where each node points to a vertex of the tree. Nodes are represented via HeapNode interface. Notice that you can you can change the nodes priority that will change priority of the corresponding vertex and update the tree to satisfy the heap property. Minimum/maximum heap is represented via MinimumHeap/MaximumHeap interface.

To construct a heap, heap implementations use HeapEntry data class to represent a pair of corresponding element and priority.

Also, some heaps may also include a structure of linked lists to be iterated over. Such heaps are represented via LinkedMinimumHeap/LinkedMaximumHeap interface and which nodes are represented via LinkedHeapNode interface.

Search trees API

In Kone, search trees are structures that consist of:

  1. an order on values called "priorities" (thus, search tree is contextful)
  2. and a tree (in computer science sense of the word) that holds several pairs of one priority value and one corresponding element in each its vertex and satisfies search tree property:
    • each vertex's children are linearly ordered,
    • each non-leaf vertex contains one pair less than the number of its children,
    • for any given non-leaf vertex AA, if C0C_0, ..., CnC_n are children of AA sorted increasingly and P1P_1, ..., PnP_n are AA's priorities sorted increasingly as well, then C0P1C1...Cn1PnCnC_0 \leqslant P_1 \leqslant C_1 \leqslant ... \leqslant C_{n-1} \leqslant P_n \leqslant C_n where CPC \leqslant P (PCP \leqslant C) means that all the priorities in a subtree with root CC are less (greater) or equal to PP.

Search tree is a nodded structure where each node points to a stored in the tree pair of a priority and an element. Nodes are represented via SearchTreeNode interface. Search tree is represented via SearchTree interface.

In case you search for an element in the tree that equals to some specified element EE, you may either find the equal element or find two elements XX and YY that X<E<YX < E < Y. Formally, there are several more cases. But all of them are represented via sealed hierarchy of interface SearchSegmentResult.

Also, obviously, the elements of search tree form a sorted list. So one may want to make a search tree to contain a structure of linked list. Such search trees are represented via LinkedSearchTree interface and which nodes are represented via LinkedSearchTreeNode interface.

💭 Concept: "reified" collections

Let there be some set (in computer science sense of the word) S of elements. The main characteristic of the set is that we can check if some object is an element of the set or not. So it makes sense to declare method like

public operator fun contains(element: Element): Boolean

inside interface like

public interface Set<Element>

that would represent sets. And we want to make the Set interface covariant (after all, a set of dogs is a set of animals and so on). But contains method is not covariant as a declaration, it's contravariant. What do we do? Well, in case of Kotlin (and Java) stdlib Set, the problem is just suppressed with @UnsafeVariance annotation. And everything works as intended, right? Yeah, everything works fine. But how?! The answer is that under the hood contains method uses only equals and hashCode methods that are defined in all objects (because they are methods of Any after all).

But Kone's approach is that the logic of element equality and hashing should be injected in set instance. In other words, set instance should not use equals and hashCode methods of elements but equalsTo and hash of received Equality and Hashing instances. Will that work with our broken covariance? Usually no, it won't work! Because usually Equality<Element> uses that the received elements are of type Element by calling its unique methods. But objects that are not instances of Element cannot use these methods.

What do we do? We need to somehow say that the elements can be accepted by the Equality instance and return default value if they are not. That's... exactly the Reification's job! So the cost of covariance is actually a need in a Reification's instance.

Because of that, there are two types of potentially covariant types: invariant ones and covariant ones that require Reification instance on initialisation. The second ones are called "reified types" in Kone.

Sets API

In Kone, sets are compositions of equality context on values called "elements" (thus, sets are contextful) and an unordered collection of elements without repetitions with respect to the equality context. There are four features that sets in Kone may have or have not:

  1. mutability — ability to add or remove elements from the set,
  2. being linked — feature of containing an extra structure of a linked list,
  3. being nodded,
  4. and being reified.

The features are independent. That's why sets hierarchy contains 24=162^4 = 16 interfaces.

Set nodes API

Set nodes hierrarchySet nodes hierrarchy

The common API that set node provides is represented via KoneSetNode. It provides element property that returns corresponding element and isDetached property that indicates that the node is detached and is no longer part of the set's structure. There is mutable variant of set nodes, KoneMutableSetNode, that also lets you remove the node removing corresponding element from the set. On top of that, there are KoneLinkedSetNode and KoneMutableLinkedSetNode that describe nodes of linked sets and allow you to get next and previous nodes (if they are present) and iterators from after and from before the node. Just like in lists.

Set iterators API

Set iterators form a hierarchy below.

Set iterators hierrarchySet iterators hierrarchy

The four interfaces KoneSetIterator, KoneLinkedSetIterator, KoneMutableSetIterator, and KoneMutableLinkedSetIterator are actually type aliases to KoneIterator, KoneReversibleIterator, KoneRemovableIterator, and KoneReversibleRemovableIterator. The bottom four interfaces KoneNoddedSetIterator, KoneLinkedNoddedSetIterator, KoneMutableNoddedSetIterator, and KoneMutableLinkedNoddedSetIterator inherit the interfaces above correspondingly and introduce methods getNextNode and getPreviousNode to get nodes of next and previous elements.

info

When iterator of a linked set is constructed, the iteration sequence is the inner linked list structure itself. And there is a guarantee that removable iterator operations would leave the iterator still usable. (But there is no guarantee that mutable set operations and other iterators operations would leave the iterator still usable.)

Sets API

Again, there are four features that sets in Kone may have or have not:

  1. mutability,
  2. being linked,
  3. being nodded,
  4. and being reified.

The interfaces containing some of the features are named correspondingly and inherit interfaces of the feature subsets.

Sets hierrarchySets hierrarchy

We are not going to discuss the list interfaces in details, because you can find all specific moments in API reference.

Maps API

In Kone, maps are compositions of equality context on so-called "keys" (thus, maps are contextful) and an unordered collection of pairs consisting of a "key" and a "value" which has no repetition of keys with respect to the equality context. Maps in Kone are always nodded. There are two features that maps in Kone may have or have not:

  1. mutability — ability to add or remove pairs found by the key from the map and change value of a present pair,
  2. and being reified.
note

Most likely, there will be also "being linked" feature like in maps soon.

The features are independent. That's why maps hierarchy contains 22=42^2 = 4 interfaces.

Map entries and map nodes API

Set nodes hierrarchySet nodes hierrarchy

To construct a map, you have to use builder functions like koneMapOf. And to provide "key-value" pairs to it there is specific interface KoneMapEntry that can be instantiated with either KoneMapEntry fake constructor or mapsTo infix function (that is just like to from Kotlin stdlib). The interface is inherited by all nodes as well, so you can provide existing node to the builder functions as well.

The common API that map node provides is represented via KoneMapNode. It provides key and value properties that return corresponding key and value and isDetached property that indicates that the node is detached and is no longer part of the map's structure. There is mutable variant of set nodes, KoneMutableSetNode, that also lets you to change the value and to remove the node removing corresponding "key-value" pair from the map.

Maps API

Again, there are two features that maps in Kone may have or have not:

  1. mutability — ability to add or remove pairs found by the key from the map and change value of a present pair,
  2. and being reified.

The interfaces containing some of the features are named correspondingly and inherit interfaces of the feature subsets.

Sets hierrarchySets hierrarchy

We are not going to discuss the list interfaces in details, because you can find all specific moments in API reference.