Pipeline
This article covers pipelines in mechanical, electrical, and software systems. For pipelines used to transport fluids such as water or petroleum, see pipeline transport.
The term pipeline has meaning in electrical and mechanical systems, as well as in software. In general, the term represents the concept of splitting a job into subprocesses in which the output of one subprocess feeds into the next (much like water flows from one pipe segment to the next).
The Unix pipes section below gives an example of a pipeline that implements a kind of spell checker for this page.
Some of the salient characteristics that distinguish Hartmann pipelines from ordinary Unix pipes are:
LOOKUP reads records from its primary and secondary input streams and writes
records to its primary, secondary, and tertiary output streams, if each is connected.
The secondary input stream must be defined and connected.
The records in the secondary input stream are the master records.
LOOKUP first reads the master records into a buffer, where records with duplicate key fields are discarded;
the first occurrence of a key is retained.
The records in the buffer are referred to as the reference.
The records in the primary input stream are the detail records.
LOOKUP compares detail records to records in the reference.
LOOKUP writes records to three output streams, if each is connected:
This arrangement allows one to use other filters to prepare the dictionary,
or master records, for input to LOOKUP from whatever source is required.
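The behavior described above can be approximated outside CMS Pipelines. The following shell sketch uses awk and is not CMS Pipelines syntax; the function name lookup_sketch, the output file names, and the choice of the first whitespace-separated field as the key are all assumptions made for illustration. The routing of records is also an assumption, since the list of output streams is not reproduced here: matched detail records go to one file, unmatched detail records to a second, and master records that no detail record referenced to a third.

```shell
# lookup_sketch MASTER DETAIL OUTDIR
# Hypothetical helper, not part of CMS Pipelines.
# Assumption: the key is the first whitespace-separated field of each record.
lookup_sketch() {
    master=$1 detail=$2 outdir=$3
    # Start with empty output files so each one exists even if nothing is written.
    : > "$outdir/matched"
    : > "$outdir/unmatched"
    : > "$outdir/unreferenced"
    awk -v out="$outdir" '
        # Master records: build the reference, keeping the first record per key.
        FILENAME == ARGV[1] {
            if (!($1 in ref)) { ref[$1] = $0; order[++n] = $1 }
            next
        }
        # Detail records: compare each key against the reference.
        ($1 in ref) { print > (out "/matched"); used[$1] = 1; next }
        { print > (out "/unmatched") }
        # After the last detail record, write the masters that were never referenced.
        END {
            for (i = 1; i <= n; i++)
                if (!(order[i] in used)) print ref[order[i]] > (out "/unreferenced")
        }
    ' "$master" "$detail"
}
```

Given a master file and a detail file, lookup_sketch master detail outdir leaves three files named matched, unmatched, and unreferenced in outdir. Writing the unreferenced masters only in the END block mirrors the description that the tertiary stream is written after the primary input reaches end of file.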
The many input/output filters, or drivers,
allow a Hartmann pipeline to interact directly with a variety of data sources,
from files to the system itself to such things as TCP/IP ports.
The repertoire of filters and drivers is rich enough that one could,
for example, write a server that consisted solely of a Hartmann pipeline.
Mechanical analogy
A mechanical example of a pipeline is a washer/dryer system for clothing.
Instead of having one unit that both washes and dries, we have two units that together form a pipeline (the output of the washer enters the dryer).
If washing takes 1 hour and drying takes 1 hour, the pipeline finishes a full load of laundry every hour once both units are busy, compared to every 2 hours for a single (non-pipelined) unit that washes and then dries.
It still requires two hours for an item of clothing to complete its wash/dry cycle, of course.
Pipelined processors
Electrically, pipelines are used in microprocessors to allow complex logic sequences to execute at faster speeds. Pipelines are related to the engineering concepts of throughput and latency.
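The relationship between throughput and latency can be made concrete with the washer/dryer numbers from the mechanical analogy. The following shell sketch assumes that every stage takes the same time, that there is at least one load, and that a new load enters as soon as the first stage is free; the function names are illustrative only.

```shell
# total_hours_unpipelined LOADS STAGES HOURS_PER_STAGE
# One combined unit: each load occupies it for every stage before the next load starts.
total_hours_unpipelined() {
    echo $(( $1 * $2 * $3 ))
}

# total_hours_pipelined LOADS STAGES HOURS_PER_STAGE
# Separate units: the first load takes STAGES * HOURS_PER_STAGE (the latency),
# then one further load finishes every HOURS_PER_STAGE (the throughput).
total_hours_pipelined() {
    echo $(( ($2 + $1 - 1) * $3 ))
}

total_hours_unpipelined 4 2 1   # 4 loads, wash + dry, 1 hour each: prints 8
total_hours_pipelined 4 2 1     # the same work through the pipeline: prints 5
```

The latency of a single load is unchanged (2 hours in both cases); only the rate at which finished loads emerge improves.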
See Instruction pipeline and Classic RISC pipeline for a better discussion.
Software pipelines
In computer software, a pipeline is a command-line feature prevalent in UNIX and UNIX-like operating systems.
Douglas McIlroy, one of the authors of the early UNIX command shells, noticed that much of the time users were feeding the output of one program into another as its input. The UNIX pioneers established a means of chaining running programs together as coprocesses, so that the output of the first program becomes the input to the second.
This was to become the famous pipes and filters design pattern.
A pipeline may be extended to any number of commands with the output of one serving as the input to the next.
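As a small illustration of chaining, and of the chained programs running at the same time, the sketch below connects three standard utilities. It relies on yes, which would print its argument forever if run on its own; the function name first_lines is hypothetical.

```shell
# first_lines COUNT TEXT
# yes repeats TEXT endlessly, head stops after COUNT lines, and tr upper-cases them.
# The pipeline finishes only because all three programs run concurrently:
# once head exits, yes is terminated on its next write to the closed pipe.
first_lines() {
    yes "$2" | head -n "$1" | tr 'a-z' 'A-Z'
}
```

Calling first_lines 3 hello prints HELLO on three lines. If the shell instead ran each command to completion before starting the next, the first stage would never finish.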
Unix pipes
Filter programs are commonly used in a UNIX pipeline, and they usually obey a few conventions: line-structured records, reading data from the standard input, and writing to the standard output.
The pipeline below chains six commands:
curl http://www.wikipedia.org/wiki/Pipeline |
sed 's/[^a-zA-Z ]//g' |
tr 'A-Z ' 'a-z\n' |
grep '[a-z]' |
sort -u |
comm -23 - /usr/dict/words
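Here is an explanation of the pipeline: curl fetches the HTML of the page; sed removes every character that is not a letter or a space; tr converts uppercase letters to lowercase and turns spaces into newlines, leaving one word per line; grep keeps only the lines that contain at least one letter, which drops blank lines; sort -u sorts the words and removes duplicates; and comm -23 prints the lines that appear only in its first input, that is, the words not found in the dictionary file /usr/dict/words.
Hartmann pipelines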
John Hartmann, a Danish engineer with IBM,
extended the basic pipes and filters paradigm in a number of useful ways.
His product, known as CMS Pipelines, is available on a number of IBM platforms.
The utility of the many filters supplied with the program is exemplified by the LOOKUP filter:
key fields. The primary and secondary output streams are severed at the end of file on the primary input stream before records are written to the tertiary output stream.