
GlueList – Fast New Java List Implementation - ertucetin
https://github.com/ertugrulcetin/GlueList
======
zinxq
Iteration will be slower than ArrayList.

Just like LinkedList (which is even worse), there will be cache misses on
jumps to new nodes.

Random access lookup will be (very slightly) slower as you determine which
node to look inside.

This is hardly a new data structure. Just one that's been thought of before and
not standardized due to a lack of compelling performance differentiation.
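
The lookup overhead described above can be sketched in a few lines. This is a
generic illustration of random access in any node-based list, not GlueList's
actual code; the Node class and field names are hypothetical.

```java
// Hedged sketch: random access in a node-based list must first walk the
// node chain to find which node holds the index, before the O(1) array
// access inside that node. Each hop to a new node is a potential cache miss.
public class NodeLookup {
    static final class Node {
        int[] elements;   // each node holds a contiguous chunk of the list
        int size;         // number of used slots in this node
        Node next;

        Node(int[] elements) {
            this.elements = elements;
            this.size = elements.length;
        }
    }

    // Walk nodes, subtracting each node's size, until we find the one
    // containing the requested index.
    static int get(Node head, int index) {
        Node node = head;
        while (index >= node.size) {
            index -= node.size;
            node = node.next;  // jump to the next node: possible cache miss
        }
        return node.elements[index];
    }

    public static void main(String[] args) {
        Node tail = new Node(new int[] {40, 50});
        Node head = new Node(new int[] {10, 20, 30});
        head.next = tail;
        System.out.println(get(head, 4)); // overall index 4 -> 50
    }
}
```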

~~~
geophile
Worrying about cache misses seems premature without value types in Java. The
list elements are not physically embedded in the list, and accessing each one
could easily result in a cache miss.

~~~
Someone
A _second_ cache miss. That the language almost (1) forces the language
implementation to be inefficient doesn't give those writing programs in the
language license to ignore further inefficiencies.

(1) A sufficiently smart compiler could, in some cases, convert a data
structure that logically uses reference types into one that uses value types
in the compiled code (I think that's highly hypothetical in this case, as it's
hardly worth compiler writers' time, given that it would be possible only very
rarely).

------
erikb
I'm sceptical whenever I read "fastest X". Fastest at what? Here it seems it
is much faster at adding elements. But what about moving elements, sorting,
searching, deleting? Most optimizations make one operation faster and
some others slower.

And if you decide to optimize one operation over the others, some use cases
for that choice would be great as well. The most expensive thing I needed to
do with lists was sorting. A list type that improves this operation over the
others would be most useful to me.

Last but not least, I think the list is such an old and often-used data
structure that it is very unlikely one can develop something a lot faster.

------
leeter
Seems like a Java implementation of std::deque
[http://en.cppreference.com/w/cpp/container/deque](http://en.cppreference.com/w/cpp/container/deque)

~~~
blaisio
Yeah that seems like exactly what it is. I kinda feel bad because it seems
like this person came up with the data structure and thought they were super
smart for "inventing" it even though most people (should) already know about
it...

------
cristaloleg
Is this an unrolled linked list implementation?
[https://en.wikipedia.org/wiki/Unrolled_linked_list](https://en.wikipedia.org/wiki/Unrolled_linked_list)

~~~
Zekio
"Unrolled linked list, In this model, the maximum number of elements is 4 for
each node." \- wiki

According to the README, it only uses 36 nodes for 10 million records, so I
guess it is like an unrolled linked list but with a higher limit of elements
per node?

~~~
jdmichal
There's nothing requiring that an unrolled linked list has a fixed number of
elements per node. It's just easier to reason about the performance results by
fixing the number. This looks like an unrolled linked list with variable
elements per node, which allows it to do things like allocate a single node
for an entire bulk add operation.
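
The variable-node-size idea described above can be sketched briefly. This is a
rough illustration under the assumptions jdmichal states, not GlueList's actual
implementation; the class and method names are hypothetical.

```java
// Sketch of an unrolled-linked-list node with a variable element count:
// a bulk add can allocate one node sized for the entire batch, rather
// than linking many fixed-capacity nodes.
import java.util.Arrays;

public class VariableUnrolled {
    static final class Node<T> {
        final Object[] elements; // chunk of the list stored in this node
        int size;                // number of used slots
        Node<T> next;

        Node(int capacity) {
            this.elements = new Object[capacity];
        }
    }

    // One allocation and one copy cover the whole bulk insert.
    static <T> Node<T> bulkNode(T[] items) {
        Node<T> node = new Node<>(items.length);
        System.arraycopy(items, 0, node.elements, 0, items.length);
        node.size = items.length;
        return node;
    }

    public static void main(String[] args) {
        Node<String> n = bulkNode(new String[] {"a", "b", "c", "d"});
        System.out.println(n.size);                      // 4
        System.out.println(Arrays.toString(n.elements)); // [a, b, c, d]
    }
}
```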

------
topbanana
This is fine and dandy, until your list becomes heavily fragmented. Suddenly
traversing your list becomes a series of expensive cache misses.

So it's not 'faster', it depends very much on the use case - like any data
structure.

~~~
Kymps
Does it actually matter that much in Java? In Java, a collection of objects is
already a collection of pointers, so wouldn't we get a fair number of cache
misses anyway? Or does the JVM have some cool mechanism to minimize those?

~~~
joajoa
With Java, and any other VM that doesn't specify an exact GC algorithm, it
always depends.

IIRC the train algorithm used in some JVMs improves locality. Most GCs use a
pointer-bump scheme anyways, leading to pretty good locality for objects that
have been created together.

So yes, the JVM _may_ have some pretty cool mechanisms to minimize those. I
would also be interested in G1's behavior and whether or not it improves
locality somehow.

------
leecarraher
some thoughts:

Remove and add worst-case complexity is O(mn)! That is terrible; a linked list
should be able to do inserts and removes in O(1).

The appending test shows it's faster. But is it? I would offer that it's
likely Java doing some more datatype and bounds checking in its
implementation, which is in line with the marginal speedup and relatively
linear growth in complexity. Furthermore, there are resize effects that will
happen periodically as big malloc+move events.

Worst-case complexity seems incorrect for worst-case search and access:
assuming bins, you'd have to traverse the bin as well, so access should be
O(m+n).

But what about a remove test, or an actual search test? This looks somewhat
like a B-tree.

~~~
peeters
> linked list should be able to do inserts and removes in O(1).

That assumes you have a reference to the node, which isn't true in Java's
LinkedList, as the nodes are encapsulated. The remove() operation in List
either takes the object you want to remove (by equality) or an index. Both
require an O(n) scan to find the node; then it can be removed in O(1).

~~~
ivan_gammel
You probably forgot that it's possible to remove the current element via an
Iterator, which indeed gives us O(1) for LinkedList in the best case. Removal
from an ArrayList is more expensive, except when you remove the last element.
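
A quick sketch of that point, using only the standard java.util API:
Iterator.remove() deletes the element the iterator last returned, so on a
LinkedList the unlink itself is O(1); the O(n) part is only the traversal to
get there.

```java
// Removing elements during iteration with Iterator.remove().
// On a LinkedList each removal is an O(1) node unlink; on an ArrayList
// the same call still shifts the tail of the backing array.
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

public class IteratorRemoval {
    // Remove all even numbers in a single pass.
    static List<Integer> dropEvens(List<Integer> list) {
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove(); // O(1) unlink on LinkedList
            }
        }
        return list;
    }

    public static void main(String[] args) {
        List<Integer> list = new LinkedList<>(List.of(1, 2, 3, 4, 5));
        System.out.println(dropEvens(list)); // [1, 3, 5]
    }
}
```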

~~~
imtringued
You can always swap the current element with the last element and then delete
the last element in an unordered list.
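
The swap-with-last trick looks like this in Java. A minimal sketch, assuming
order doesn't matter; the helper name is mine, not part of any library.

```java
// Swap-remove: O(1) deletion from an ArrayList when element order is
// irrelevant. Overwrite the target slot with the last element, then drop
// the last slot (removing the final index does no array copy).
import java.util.ArrayList;
import java.util.List;

public class SwapRemove {
    // Returns the removed element, like List.remove(int).
    static <T> T swapRemove(List<T> list, int i) {
        T removed = list.set(i, list.get(list.size() - 1));
        list.remove(list.size() - 1); // no shifting: last element
        return removed;
    }

    public static void main(String[] args) {
        List<String> l = new ArrayList<>(List.of("a", "b", "c", "d"));
        swapRemove(l, 1);       // removes "b"
        System.out.println(l);  // [a, d, c] -- order not preserved
    }
}
```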

~~~
ivan_gammel
java.util.List is ordered by definition (first sentence in Javadoc, if you
want to verify).

------
ccleve
What's old is new again:

[http://javolution.org/](http://javolution.org/)

This library has been around forever, and extends the same concept to other
collection types.

------
kingnuscodus
It seems this may have been done already: have you looked at the Clojure data
structure implementations? Most, e.g. vectors, use a 32-way branching factor
with effectively constant access/update time. Clojure data types are all
accessible from Java.
Check it out:
[http://clojure.org/reference/data_structures#Vectors](http://clojure.org/reference/data_structures#Vectors)

------
grashalm
Bad title. Data structures always come with trade-offs. In this case:
insertion speed and memory efficiency vs. iteration and random-access speed.

------
chvid
Why is everything in the root namespace? Have you used this library for any
real-life project?

Try something like sort for a benchmark.

------
ape4
In addition to what everyone else has said, ArrayList may get faster with a
Java update.

------
blaisio
This makes me wish I could downvote things on HN. This is not new, nor is
there any significant evidence that it's "fastest" (a term that doesn't really
mean much without context anyway).

------
revertts
The benchmarks do not use Caliper or JMH. This is _very_ important -
microbenchmarks on the JVM rarely do what you think they do without extra
work.

------
jkot
Using sun.misc.Unsafe to access array elements would make it much faster. Also
ArrayList with preset size is faster for inserting elements.
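
The presizing point can be sketched with the standard ArrayList(int) constructor;
the numbers here are illustrative, not a benchmark.

```java
// Giving ArrayList its final size up front avoids the repeated
// grow-and-copy steps during a known-size bulk insert.
import java.util.ArrayList;
import java.util.List;

public class Presized {
    public static void main(String[] args) {
        int n = 1_000_000;

        // Default capacity: the backing array is reallocated and copied
        // several times as it grows by ~1.5x whenever it fills up.
        List<Integer> growing = new ArrayList<>();

        // Preset capacity: a single allocation, no copies while adding.
        List<Integer> presized = new ArrayList<>(n);

        for (int i = 0; i < n; i++) {
            growing.add(i);
            presized.add(i);
        }
        System.out.println(growing.equals(presized)); // true
    }
}
```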

~~~
Reason077
_Using sun.misc.Unsafe to access array elements would make it much faster._

sun.misc.Unsafe should not make any difference to performance here. OK, you
skip bounds checks, but modern VMs are very good at optimising those away
anyway.

 _Also ArrayList with preset size is faster for inserting elements._

Exactly what I was thinking. The only case where this is really going to help
is when you don't have an idea of what the size is going to be.

~~~
jkot
Decompression on byte[] is 3x faster with Unsafe. I think modern JVMs are far
from optimal.

~~~
Reason077
Yes, that may be true for byte[], but there are no primitive types in play
here. Java collections hold object types only.

