
Find the closest point in O(logN) complexity? - heythisisom
"Given a billion points p1, p2, p3, ..., pN, where each point pi is of the form (xi, yi). Also given is a query point Q of the form (x, y). Find the point P closest to Q. Constraint: time complexity O(logN)."
The naive solution to this problem is O(N). I'm wondering if there's any optimization that gets it down to O(logN).

EDIT: Each point {pi} is unique.
======
jepler
We're talking about the Euclidean distance measure, right? If you are performing
many queries against the same set of points {pi}, then yes. Build a structure
such as a BSP tree or quadtree containing all N points {pi} in O(n log n) time;
each subsequent search then runs in O(log n). You'll need a bit of extra
bookkeeping, since points closer than the "best point so far" can lie within
more than one branch of the tree, so you can't naively follow the usual
quadtree bounding-box descent.
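
A minimal sketch of that idea using a k-d tree (a close cousin of the quadtree/BSP approach), assuming Python and Euclidean distance. The `nearest` function shows the extra bookkeeping mentioned above: the far branch is searched only when the splitting plane lies closer than the best point found so far. Query time is O(log n) in expectation for well-distributed points, not a worst-case guarantee.

```python
import math

def build_kdtree(points, depth=0):
    # Recursively split on alternating axes; roughly O(n log n) to build.
    if not points:
        return None
    axis = depth % 2
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, q, best=None):
    # Descend toward q first; visit the far side only if the splitting
    # plane is closer to q than the best point found so far.
    if node is None:
        return best
    p, axis = node["point"], node["axis"]
    if best is None or math.dist(q, p) < math.dist(q, best):
        best = p
    if q[axis] < p[axis]:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    best = nearest(near, q, best)
    if abs(q[axis] - p[axis]) < math.dist(q, best):
        best = nearest(far, q, best)
    return best

pts = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build_kdtree(pts)
print(nearest(tree, (9, 2)))  # -> (8, 1)
```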

~~~
heythisisom
That's great, thanks for the suggestion on BSP trees/quadtrees, and yes, we're
talking about the Euclidean distance measure. The input points {p1, p2, ..., pN}
are given as an array, and to perform the search in O(logN) we first have to
construct the BSP tree or quadtree, which takes O(NlogN) time. Put together,
that costs O(NlogN), i.e. O(NlogN + logN). But according to the problem, all
operations have to be performed within O(logN) time.

EDIT: I'm not performing the computation on the same set of points. Each point
is unique.

~~~
brudgers
There is no general-purpose sorting algorithm faster than O(n log n) in the
worst case. But when sorting happens only once, its (n log n) cost is a fixed
constant, and the running time of each search is (log n + c), where c is that
one-time cost amortized across the (log n) searches.

    The big O for (log n + c) is O(log n).

However, you can do better by memoizing the search to get O(1) speed in
exchange for O(n) space. But that's probably not in keeping with the
parameters of the exercise.
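
One way to read the memoization remark, as a sketch (assuming Python, repeated queries, and a plain linear scan underneath; `nearest_neighbor` is a hypothetical helper, not part of the exercise):

```python
import math
from functools import lru_cache

points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]

@lru_cache(maxsize=None)
def nearest_neighbor(q):
    # First call for a given q pays the full O(n) scan; repeats of the
    # same q are O(1) cache lookups, at the cost of extra space per
    # distinct query point.
    return min(points, key=lambda p: math.dist(p, q))

print(nearest_neighbor((9, 2)))  # pays the linear scan
print(nearest_neighbor((9, 2)))  # served from the cache
```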

~~~
heythisisom
I'm a beginner in algorithms. Can you please explain "but when sorting only
happens once then the value of (n log n) is constant"? Is it good practice to
treat that as a constant? We usually look at the problem as a whole and add up
the running times of its parts, e.g. sorting O(NlogN) + search O(logN), giving
a final running time of O(NlogN). We can solve the problem in O(N) time by
simply computing the Euclidean distance to each point and keeping the minimum
distance in memory, but that's regarded as the naive solution.
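
The O(N) naive solution described here, as a sketch in Python (the point set is made-up sample data):

```python
import math

def closest_naive(points, q):
    # O(N): compute the distance to each point, keep the running minimum.
    best, best_d = None, float("inf")
    for p in points:
        d = math.dist(p, q)
        if d < best_d:
            best, best_d = p, d
    return best

pts = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
print(closest_naive(pts, (9, 2)))  # -> (8, 1)
```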

~~~
brudgers
I am a beginner as well. The actual worst case time of an algorithm with big O
notation of O[log n] is [(a * log n) + c] where:

[a] represents the idea that there may be an arbitrary number of steps in the
portion of the program that is _proportional_ to [log n]

[c] represents the idea that there is a constant amount of running time
overhead that is independent of the core efficiency of the algorithm.

Big O notation is convenient because it gets rid of [a] and [c]. It can be
misleading because [a] and [c] might dominate the running time in all practical
cases. It can also be misleading in the other direction: as [n] becomes large,
what an O(log n) bound actually describes is (a log n).

Here we are only doing one O(n log n) operation, so long as the sorted array
is memoized and used for future queries. Note that if the array is sorted
destructively, the memory is [a' * n], or O(n), which is the same as having
an unsorted array (though [a'] may be bigger than the factor required without
sorting).

Anyway, I don't really know what is or isn't a naive solution. But I've read a
bit of Knuth and I don't think he would encourage looking for unneeded
complexity because algorithms and computer science are hard enough just taking
the simplest approach.

~~~
heythisisom
Oh, yeah. Thanks for the clarification.

