

Here is a piece of C++ code that shows some very peculiar behavior. For some strange reason, sorting the data (before the timed region) miraculously makes the loop almost six times faster. The timed loop sums up some independent terms, so the order should not matter; elapsed time is measured with clock():

    #include <ctime>
    double elapsedTime = static_cast<double>(clock() - start) / CLOCKS_PER_SEC;

My first thought was that sorting brings the data into the cache, but then I thought how silly that was because the array was just generated. So why is processing a sorted array faster than processing an unsorted array?
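For reference, a complete, runnable version of the experiment can be sketched as follows. This is a minimal reconstruction of what the description implies, not the original listing: the array size, the iteration count, and the >= 128 threshold on the summed elements are assumptions made for illustration.

    #include <algorithm>
    #include <cstdlib>
    #include <ctime>
    #include <iostream>

    int main()
    {
        // Fill the array with small random values (assumed range 0..255).
        const unsigned arraySize = 32768;
        int data[arraySize];
        for (unsigned c = 0; c < arraySize; ++c)
            data[c] = std::rand() % 256;

        // Sorting here is the only difference between the fast and slow runs;
        // comment this line out to reproduce the unsorted (slow) case.
        std::sort(data, data + arraySize);

        // Timed region: repeatedly sum the elements that pass a
        // data-dependent test (the 128 threshold is an assumption).
        clock_t start = clock();
        long long sum = 0;
        for (unsigned i = 0; i < 100000; ++i)
            for (unsigned c = 0; c < arraySize; ++c)
                if (data[c] >= 128)
                    sum += data[c];

        double elapsedTime = static_cast<double>(clock() - start) / CLOCKS_PER_SEC;
        std::cout << elapsedTime << "\n" << "sum = " << sum << "\n";
        return 0;
    }

Note that with aggressive optimization some modern compilers eliminate the branch altogether, which is why the followup Q&As below report the sorted and unsorted cases running at the same speed.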

Related / followup Q&As about the same effect with different / later compilers and options:

- Why is processing an unsorted array the same speed as processing a sorted array with modern x86-64 clang?
- gcc optimization flag -O3 makes code slower than -O2

You are a victim of branch prediction fail.

Picture a railroad junction. (Image by Mecanismo, via Wikimedia Commons.) Now for the sake of argument, suppose this is back in the 1800s, before long-distance or radio communication.

You are the operator of a junction and you hear a train coming. You have no idea which way it is supposed to go. You stop the train to ask the driver which direction they want, and then you set the switch appropriately. Trains are heavy and have a lot of inertia, so they take forever to start up and slow down.

Is there a better way? You guess which direction the train will go! If you guessed right, it continues on. If you guessed wrong, the captain will stop, back up, and yell at you to flip the switch. If you guess right every time, the train will never have to stop; if you guess wrong too often, the train will spend a lot of time stopping, backing up, and restarting.

Now consider an if-statement: at the processor level, it is a branch instruction. You are a processor and you see a branch. You have no idea which way it will go. What do you do? You halt execution and wait until the previous instructions are complete.
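One way to see that the branch itself is the problem is to rewrite the inner test without a conditional jump. The sketch below is an illustration under the assumption that the loop filters values against a threshold of 128 (as in the reconstruction above); it computes a mask instead of branching, so there is nothing for the predictor to guess, and sorted and unsorted inputs take the same time.

    // Illustrative branchless variant of the inner loop. Assumes the data
    // values are non-negative ints below 256, that int is 32 bits, and that
    // right-shifting a negative int sign-fills (true on mainstream
    // compilers, though formally implementation-defined).
    long long sumLargeValuesBranchless(const int* data, unsigned n)
    {
        long long sum = 0;
        for (unsigned c = 0; c < n; ++c)
        {
            int t = (data[c] - 128) >> 31; // 0 if data[c] >= 128, -1 otherwise
            sum += ~t & data[c];           // adds data[c] or 0, with no branch
        }
        return sum;
    }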
