My requirement is implementing a Queue<T> (where T is a Lexeme) that is capable of insertion sort and insertion merge.
The Lexeme interface:
/**
* A lexeme is the smallest piece of data it is possible to
* extract during the lexing process.
* For example, a lexeme could be a keyword, a comment, or an identifier.
*
* @author Edoardo Luppi
*/
interface Lexeme : Segment {
/** The IntelliJ Platform type for this lexeme. */
fun getPlatformType(): IElementType
/** The start offset for this lexeme inside the characters buffer. */
override fun getStartOffset(): Int
/** The end offset for this lexeme inside the characters buffer. */
override fun getEndOffset(): Int
/** Returns if this lexeme fully contains the inputted lexeme or not. */
fun contains(lexeme: Lexeme): Boolean
/** Returns if this lexeme intersects with the inputted lexeme or not. */
fun intersects(lexeme: Lexeme): Boolean
/**
* Creates a copy of this lexeme, shifted right (using a positive number)
* or left (using a negative number) by the given positions.
*/
fun shift(positions: Int): Lexeme
/** Creates a copy of this lexeme with different offsets. */
fun copy(startOffset: Int, endOffset: Int): Lexeme
}
What do I mean by insertion merge?
During the lexing process of a document, the document is lexed with multiple passes, and each pass detects some types of tokens. A pass' token may overlap with the preceding one on the same offsets, in which case the latest token always wins.
In the end I came up with something pretty straightforward using an ArrayList as base implementation.
The following is the important part:
/**
* A queue implementation that sort lexemes on insertion and that
* is capable of merging new ones with old ones in case they overlap.
*
* @author Edoardo Luppi
*/
class SortedArrayLexemeQueue(initialCapacity: Int = 32) :
ArrayList<Lexeme>(initialCapacity),
Queue<Lexeme> {
override fun add(element: Lexeme): Boolean {
if (isEmpty()) {
return super.add(element)
}
val index = findIndex(element)
if (index < 0) {
// The new element does not overlap with other elements,
// thus we can use the index returned by 'findIndex'
// to insert it on the right spot and keep the queue sorted
super.add(-index - 1, element)
return true
}
val iterator = listIterator(index)
while (iterator.hasNext()) {
val lexeme = iterator.next()
if (!element.intersects(lexeme)) {
continue
}
iterator.remove()
if (element.contains(lexeme) && iterator.hasNext()) {
// The new element fully override the old one, so we simply remove it
continue
}
if (lexeme.startOffset < element.startOffset) {
iterator.add(lexeme.copy(lexeme.startOffset, element.startOffset))
}
iterator.add(element)
if (lexeme.endOffset > element.endOffset) {
iterator.add(lexeme.copy(element.endOffset, lexeme.endOffset))
}
}
return true
}
/**
* Returns the index where we should start processing overlaps for this element.
* Otherwise, if the element does not overlap with existing ones, returns
* the inverted insertion point (-insertion point - 1), defined as the index
* at which the element should be inserted so that the list remains sorted.
*/
private fun findIndex(element: Lexeme): Int {
var sortedPosition = size
for ((index, lexeme) in withIndex()) {
if (element.intersects(lexeme)) {
return index
}
if (lexeme.startOffset > element.startOffset) {
sortedPosition = index
break
}
}
return -sortedPosition - 1
}
...
The expected queue size is around 10-60 elements, as it gets consumed often.
Because of the merge feature, I could not use the standard PriorityQueue.
My question is, can I do this better (considering I have a small amount of time)? If so, how?
Do you spot any possible issue?
