I was in a computer architecture class last semester, and the last chapter was about all this multiprocessor stuff. It's quite confusing and I didn't really learn it that well.
What's the big deal about it? Do multiple cores really provide the speedup they promise? Is task-level parallelism a good idea, or do the separate processes need to communicate with each other a lot?
And how does vector processing play into things? How many applications really rely on it? I know a lot of this is geared toward graphics and sometimes sound, but does it have much benefit for normal processing? Will it seriously change how programming is done?
I guess this is kind of like the change from learning calculus in a scalar way (Calc 1 and early Calc 2) to transitioning to a vector way of thinking (vector calculus). The vector version still feels more complicated to me: I understand calculus from a scalar perspective, but I've forgotten a lot of the vector calc formulas and how they relate back to the scalar world. I wish math were taught in a more generalized way that accounts for vectors and matrices (arbitrary R^n) from the start.
parallel processing has been a big deal since cpus hit the clock speed wall back in the pentium 4 days. since then, power and heat have made it impractical to push clock speeds much past 4ghz, so the primary way to make cpus faster is parallelism: multiple pipelines (superscalar execution), multi-core, and/or hyperthreaded designs.
certain instructions depend on the results of others and have to be processed in a particular order. a problem with parallelism is when the cpu guesses wrong about what to execute next (a mispredicted branch, say): the speculative instructions have to be flushed from the pipeline so the required instruction can complete first. this was a particular problem with the pentium 4, even as a single core, because it combined aggressive out-of-order execution with a very deep pipeline, roughly double the stages of the previous generation (20+ stages, over 30 in later models), so every flush threw away a lot of in-flight work. deepening the pipeline proved inefficient past that point, so intel transitioned back to shorter pipelines and added more cores instead.
the pentium 4's hyperthreading technology also proved problematic at times, so it was not included in the core architecture, and was only re-introduced later in the i7s and i5s once intel had refined it. even with the improved hyper-threading, the physical core and the second "logical" core have to share the core's internal execution hardware, so physical multi-core has proven more effective than hyperthreading alone.
with increased pipelines and out-of-order execution within a single core, the performance gain is automatic: the hardware extracts the parallelism from the instruction stream itself. with multi-core technology, the gain is software dependent, since the application has to be written with multiple threads to benefit, and hyper-threading depends on a combination of the two. letting the application control the parallelism tends to work better, because the hardware can only find so much parallelism on its own in software that was never designed with it in mind.
vector processing also uses cpu instruction set extensions aimed at multimedia, such as mmx and sse. the idea is different from out-of-order execution: a single simd (single instruction, multiple data) instruction applies the same operation to several data elements at once, instead of issuing a separate instruction per element. generally thread-level parallelism divides everything into threads so many chunks of data can be processed on separate cores at the same time, while vector processing goes the other way and packs more data into each instruction within one core. vector processing sidesteps part of the dependency problem, since the elements inside a vector are independent by construction, and it works best on regular data like audio samples and pixels, while thread parallelism shines in jobs like encoding that split cleanly into chunks. each approach works differently depending on the type of data being processed.