Data parallelism

From Simple English Wikipedia, the free encyclopedia

Data parallelism (also known as loop-level parallelism) is a form of parallel computing for multiple processors using a technique for distributing the data across different parallel processor nodes. It contrasts to task parallelism as another form of parallelism.

In a multiprocessor system where each one is executing a single set of instructions, data parallelism is achieved when each processor performs the same task on different pieces of distributed data. In some situations, a single execution thread controls operations on all pieces of data. In others, different threads control the operation, but they execute the same code.

For example, if we are running code on a 2-processor system (CPUs A and B) in a parallel computing environment, and we want to do a task on some data D, it is possible to tell CPU A to do that task on one part of D and CPU B on another part of D simultaneously (at the same time), in order to reduce the runtime of the execution.

Data parallelism is used by many applications especially data processing applications; one of the examples is database applications. Most real programs use a combination of data parallelism and task parallelism.