
Fastest sorting algorithm

  1. #1
    Senior Member TriKri

    Fastest sorting algorithm

    Hi!

    I have heard that it has been proved that no general-purpose sorting algorithm can have an average time complexity better than \mathcal{O}(n \log n) on normal hardware. Even though I have never heard of an algorithm with a better time complexity, I can't see why one shouldn't be possible, and I can't think of any way to prove the claim. Have you heard of it before, and do you know how it has been proved, or how to prove it?

  2. #2
    Grand Panjandrum CaptainBlack
    Quote Originally Posted by TriKri View Post
    Hi!

    I have heard that it has been proved that no general-purpose sorting algorithm can have an average time complexity better than \mathcal{O}(n \log n) on normal hardware. Even though I have never heard of an algorithm with a better time complexity, I can't see why one shouldn't be possible, and I can't think of any way to prove the claim. Have you heard of it before, and do you know how it has been proved, or how to prove it?

    Doing some research on this, it appears to be untrue.

    The O(n \log n) lower bound on time complexity applies to sorts based on comparisons (though in the sort time I have been looking into this I have not seen a proof). It turns out that there are faster sorts (bucket sort, for one) which, under certain conditions on the input data, have an expected time complexity of O(n).
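
    For concreteness, here is a minimal bucket sort sketch in Java. It assumes the keys are doubles uniformly distributed in [0, 1), the usual condition under which the expected running time is O(n); the class and method names are my own illustrative choices.

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    public class BucketSort {
        /**
         * Sorts keys assumed to lie in [0, 1). With n buckets and uniformly
         * distributed input, each bucket holds O(1) elements on average, so
         * the expected total time is O(n).
         */
        public static void bucketSort( double [ ] a ) {
            int n = a.length;
            if( n == 0 )
                return;
            List<List<Double>> buckets = new ArrayList<List<Double>>( n );
            for( int i = 0; i < n; i++ )
                buckets.add( new ArrayList<Double>() );
            // Scatter: key k lands in bucket floor( k * n ).
            for( double k : a )
                buckets.get( (int) ( k * n ) ).add( k );
            // Sort each (expected O(1)-sized) bucket, then gather in order.
            int pos = 0;
            for( List<Double> bucket : buckets ) {
                Collections.sort( bucket );
                for( double k : bucket )
                    a[ pos++ ] = k;
            }
        }
    }

    The point is that no key is ever compared against keys outside its own bucket, which is how the comparison lower bound is sidestepped.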

    RonL

  3. #3
    Senior Member TriKri
    You're right. But the proof must have been done under certain assumptions. In this case, the numbers can only lie within a certain range, since you have to be able to set the range of each bucket. But what if the numbers are unlimited?

  4. #4
    Junior Member shawn
    Quote Originally Posted by TriKri View Post
    You're right. But the proof must have been done under certain assumptions. In this case, the numbers can only lie within a certain range, since you have to be able to set the range of each bucket. But what if the numbers are unlimited?

    If the numbers are unlimited, how would you store them all in the computer architecture of today? That would require unlimited storage space, which, believe it or not, we don't have.

  5. #5
    Grand Panjandrum CaptainBlack
    Quote Originally Posted by shawn View Post
    If the numbers are unlimited, how would you store them all in the computer architecture of today? That would require unlimited storage space - which believe it or not; we don't have.

    That is not what unlimited means in this context: it means that the potential range of the data to be sorted is unlimited.

    RonL

  6. #6
    Junior Member shawn
    Quote Originally Posted by CaptainBlack View Post
    That is not what unlimited means in this context, it means that the potential
    range of the data to be sorted is unlimited.

    RonL
    I know, but even if it were sorting just two numbers, either of the two given numbers could in theory be so large that it would take an unlimited amount of storage space. I mean, unless by unlimited he actually means limited by what today's hardware can represent.

    My suggestion: try coding one that can take any arbitrary set of comparable data and sort it faster.
    Also, we're talking worst case: depending on the data set you have, an algorithm might hit its best case most of the time (though you can't prove it will do so all the time), while on other data sets it hits its worst case most of the time.

  7. #7
    Senior Member TriKri
    Well, if we're talking limited disk space, we can conclude that we can only have a limited number of elements to sort, so the running time is bounded, and then every sorting algorithm runs in constant time.

  8. #8
    Junior Member shawn
    Quote Originally Posted by TriKri View Post
    Well, if we're talking limited disk space, we can conclude that we can only have a limited number of elements to sort, so the running time is bounded, and then every sorting algorithm runs in constant time.
    Untrue.
    Try coding one, or even just coming up with an algorithm that sorts anything in constant time. Maybe you are thinking of access speed or searching?

    Constant time means that each element in the set you're sorting only has to be moved once, so if you swap two elements, you can't ever touch either of them again.

  9. #9
    Junior Member shawn
    Here's a link for you; there's a whole bunch of sorting algorithms:

    Merge sort is usually O(n \log n), but they coded it differently to save space.

    http://www.cs.ubc.ca/spider/harrison...ting-demo.html

  10. #10
    Senior Member TriKri
    Quote Originally Posted by shawn View Post
    Untrue.
    Try coding one, or even just coming up with an algorithm that sorts anything in constant time. Maybe you are thinking of access speed or searching?

    Constant time means that each element in the set you're sorting only has to be moved once, so if you swap two elements, you can't ever touch either of them again.
    \mathcal{O}(f(x)) stands for f(x)\cdot l(x), where l(x) is a bounded function, meaning there is a bound a such that |l(x)| \leq a for all valid x. So \mathcal{O}(1), which is the same as constant time, means there is a bound the execution time never exceeds. What you mean is probably \mathcal{O}(n), or linear time, even though that doesn't imply a limit of one move per element.
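
    For comparison, the standard textbook form of the same definition (equivalent to the bounded-function version above) is:

    f(x) = \mathcal{O}(g(x)) \iff \exists\, a > 0,\ x_0 \text{ such that } |f(x)| \leq a\,|g(x)| \text{ for all } x \geq x_0.

    In particular, \mathcal{O}(1) says the quantity stays below some fixed constant, no matter the input size.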

  11. #11
    Super Member angel.white
    For the bucket sort, you could first determine the size of the largest element, i.e.:
    • Have a variable "flagbig" which keeps track of the largest item in the list.
    • Have a variable "flagsmall" which keeps track of the smallest item in the list.
    • Initialize flagbig and flagsmall to the first item's value. (A better choice than zero, because the elements could be split evenly between positive and negative values. Of course, if you are dealing with a different type of data than numbers, you may not have to worry about such situations.)
    • Have a variable "count" which keeps track of how many items are in the list.
    • Initialize count to zero.
    • Have a variable "current" which keeps track of which element is being evaluated.
    • Initialize current to zero (or in a linked list, have current point to the first node).


    (Note that if you are using an array, you can use current to track the count, if you are using it as an integer index into the array rather than a pointer to a node.)

    while current is less than the number of elements (or in a linked list, while current != NULL):
    • if the value at current is greater than flagbig, then flagbig equals the value at current
    • if the value at current is smaller than flagsmall, then flagsmall equals the value at current
    • increment current (or in a linked list: point current to the next node)
    • increment count
    end loop

    Now you know the size of the range, flagbig - flagsmall, and you know how many items you have, count. You can then use this information to set up efficient choices for your buckets. (A code sketch of this pass follows below.)

    This process adds an extra pass of n steps to your time, though. But it's my understanding that big O notation ignores such "trivial" time differences anyway.
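
    Here is a minimal Java sketch of that counting pass. The variable names follow the description above; taking the input as an array of doubles is my own assumption.

    public class BucketPrep {
        /**
         * One pass over the data to find the largest element, the smallest
         * element, and the element count, so bucket ranges can be chosen.
         */
        public static void scan( double [ ] list ) {
            // Initialize both flags to the first item's value rather than
            // zero, in case the data mixes positive and negative values.
            double flagbig = list[ 0 ];
            double flagsmall = list[ 0 ];
            int count = 0;
            for( int current = 0; current < list.length; current++ ) {
                if( list[ current ] > flagbig )
                    flagbig = list[ current ];
                if( list[ current ] < flagsmall )
                    flagsmall = list[ current ];
                count++;   // redundant for an array (it equals list.length),
                           // kept to mirror the linked-list version
            }
            System.out.println( "range = " + ( flagbig - flagsmall )
                    + ", count = " + count );
        }
    }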
    Quote Originally Posted by CaptainBlack View Post
    (though in the sort time I have been looking into this I have not seen a proof).
    Freudian slip! Someone has sorting on the brain.

  12. #12
    Junior Member shawn
    Quote Originally Posted by TriKri View Post
    \mathcal{O}(f(x)) stands for f(x)\cdot l(x), where l(x) is a bounded function, meaning there is a bound a such that |l(x)| \leq a for all valid x. So \mathcal{O}(1), which is the same as constant time, means there is a bound the execution time never exceeds. What you mean is probably \mathcal{O}(n), or linear time, even though that doesn't imply a limit of one move per element.
    You have n elements to sort, and that stays constant; to ensure they are sorted you have to visit each element more than once, and for a comparison sort the worst case is O(n \log n) at best.
    Take a look at merge sort, and see how it manages to get O(n \log n) and why you cannot get any better.

    /**
     * Mergesort algorithm.
     * @param a an array of Comparable items.
     */
    public static void mergeSort( Comparable [ ] a ) {
        Comparable [ ] tmpArray = new Comparable[ a.length ];
        mergeSort( a, tmpArray, 0, a.length - 1 );
    }

    /**
     * Internal method that makes recursive calls.
     * @param a an array of Comparable items.
     * @param tmpArray an array to place the merged result.
     * @param left the left-most index of the subarray.
     * @param right the right-most index of the subarray.
     */
    private static void mergeSort( Comparable [ ] a, Comparable [ ] tmpArray,
                                   int left, int right ) {
        if( left < right ) {
            int center = ( left + right ) / 2;
            mergeSort( a, tmpArray, left, center );
            mergeSort( a, tmpArray, center + 1, right );
            merge( a, tmpArray, left, center + 1, right );
        }
    }

    /**
     * Internal method that merges two sorted halves of a subarray.
     * @param a an array of Comparable items.
     * @param tmpArray an array to place the merged result.
     * @param leftPos the left-most index of the subarray.
     * @param rightPos the index of the start of the second half.
     * @param rightEnd the right-most index of the subarray.
     */
    private static void merge( Comparable [ ] a, Comparable [ ] tmpArray,
                               int leftPos, int rightPos, int rightEnd ) {
        int leftEnd = rightPos - 1;
        int tmpPos = leftPos;
        int numElements = rightEnd - leftPos + 1;

        // Main loop: take the smaller front element of the two halves
        while( leftPos <= leftEnd && rightPos <= rightEnd )
            if( a[ leftPos ].compareTo( a[ rightPos ] ) <= 0 )
                tmpArray[ tmpPos++ ] = a[ leftPos++ ];
            else
                tmpArray[ tmpPos++ ] = a[ rightPos++ ];

        while( leftPos <= leftEnd )    // Copy rest of first half
            tmpArray[ tmpPos++ ] = a[ leftPos++ ];

        while( rightPos <= rightEnd )  // Copy rest of right half
            tmpArray[ tmpPos++ ] = a[ rightPos++ ];

        // Copy tmpArray back
        for( int i = 0; i < numElements; i++, rightEnd-- )
            a[ rightEnd ] = tmpArray[ rightEnd ];
    }

    If you think there is a better algorithm than something like merge sort on an arbitrary data set, please provide it.

  13. #13
    Super Member angel.white
    Is there a more descriptive notation than big O? It bothers me how generalized it is. Like, if I understand it correctly, two sorts could have the same big O value, say n^2, but one could have twice as many steps involved as the other, a huge difference that big O wouldn't account for.

    ...or maybe I don't quite understand how it works?

  14. #14
    Senior Member TriKri
    Quote Originally Posted by angel.white View Post
    Is there a more descriptive notation than big O? It bothers me how generalized it is. Like, if I understand it correctly, two sorts could have the same big O value, say n^2, but one could have twice as many steps involved as the other, a huge difference that big O wouldn't account for.

    ...or maybe I don't quite understand how it works?
    I think you have understood how it works. There should be some more descriptive notation, but I don't know of one; I'll look it up. Another thing about big O notation: every function f = \mathcal{O}(n \log n) is also \mathcal{O}(n^2) for n above a positive constant, since n^2 grows toward infinity faster than n \log n; you can derive this from the definition I posted earlier. Note that f = \mathcal{O}(n^2) alone doesn't imply f \neq \mathcal{O}(n \log n). f(x) = \mathcal{O}(g(x)) only means that there is a constant factor a you can multiply g(x) by so that |f(x)| \leq a\cdot |g(x)|.
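
    To spell out that containment: for n \geq 1 we have \log n \leq n, so

    n \log n \leq n^2, \quad \text{and hence} \quad |f(n)| \leq a\, n \log n \leq a\, n^2,

    which is exactly f = \mathcal{O}(n^2). The converse fails: for f(n) = n^2 there is no constant a with n^2 \leq a\, n \log n for all large n.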

  15. #15
    Junior Member shawn
    Quote Originally Posted by TriKri View Post
    I think you have understood how it works. There should be some more descriptive notation, but I don't know of one; I'll look it up. Another thing about big O notation: every function f = \mathcal{O}(n \log n) is also \mathcal{O}(n^2) for n above a positive constant, since n^2 grows toward infinity faster than n \log n; you can derive this from the definition I posted earlier. Note that f = \mathcal{O}(n^2) alone doesn't imply f \neq \mathcal{O}(n \log n). f(x) = \mathcal{O}(g(x)) only means that there is a constant factor a you can multiply g(x) by so that |f(x)| \leq a\cdot |g(x)|.
    There are also little o, \Theta, and big and little \Omega. They bound the running time strictly from above (o), from below (\Omega, \omega), or from both sides at once (\Theta).
    More Notation-Theta and Little Oh

    If you want to know the exact running time of a portion of code that doesn't depend on the number of elements, you can just count the operations.
    If you want actual running times, code it and use analysis tools like j-rat for Java.

    Remember, every computer system is different, so even if you say something runs in O(1), that just means the number of operations is bounded by a constant; it doesn't tell you how fast or slow it will actually be when you run it on a real machine.
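
    For example, here is a quick (and unscientific) way to measure an actual running time in Java with the standard library. The array size and the choice of java.util.Arrays.sort are arbitrary illustrative choices.

    import java.util.Arrays;
    import java.util.Random;

    public class TimingDemo {
        public static void main( String [ ] args ) {
            Random rng = new Random( 42 );
            double [ ] a = new double[ 1000000 ];
            for( int i = 0; i < a.length; i++ )
                a[ i ] = rng.nextDouble();

            long start = System.nanoTime();
            Arrays.sort( a );    // an O(n log n) sort from the standard library
            long elapsedMs = ( System.nanoTime() - start ) / 1000000;
            // Big O says nothing about this number; it depends on the machine,
            // the JVM, the cache, and the data.
            System.out.println( "sorted 1000000 doubles in " + elapsedMs + " ms" );
        }
    }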
