Tutorial Three

Selecting Records

awk opens a file and reads it serially, one line at a time. In the example from tutorial two, no pattern was specified, so awk performed the print action for every line of the file.

By specifying a pattern, we can select only those lines that contain a certain string of characters. For example, we could use a pattern to display all the countries in our data file, shown below, that are in Europe.

Canada:3852:25:North America
USA:3615:237:North America
Brazil:3286:134:South America
England:94:56:Europe
France:211:55:Europe
Japan:144:120:Asia
Mexico:762:78:North America
China:3705:1032:Asia
India:1267:746:Asia

To do this we would use the following awk command...
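
The command is not reproduced here, so the following is a minimal sketch rather than the tutorial's own listing. It assumes the sample data is saved in a file called countries (a hypothetical name) and recreates that file first so the example runs on its own.

```shell
# Recreate the sample data. The file name "countries" is an assumption.
cat > countries <<'EOF'
Canada:3852:25:North America
USA:3615:237:North America
Brazil:3286:134:South America
England:94:56:Europe
France:211:55:Europe
Japan:144:120:Asia
Mexico:762:78:North America
China:3705:1032:Asia
India:1267:746:Asia
EOF

# Select every line that contains the string "Europe".
# No action is given, so awk applies its default action and prints the line.
awk '/Europe/' countries
# prints:
# England:94:56:Europe
# France:211:55:Europe
```

The pattern is matched against the whole line, so no field separator needs to be specified for this selection.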

The advantage of record selection within awk comes when you want to perform formatting functions or other processing on the records you have selected. A string of characters placed between forward slashes (/.../) is called a regular expression. Any occurrence of that pattern within a line will cause the line to be selected. If you want to select records on the basis of data in a particular field, you can use a matching operator such as the double equals sign (==):
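
A field-based selection of this kind might look like the sketch below. As before, the file name countries is an assumption, and the file is recreated so the example is self-contained.

```shell
# Recreate the sample data. The file name "countries" is an assumption.
cat > countries <<'EOF'
Canada:3852:25:North America
USA:3615:237:North America
Brazil:3286:134:South America
England:94:56:Europe
France:211:55:Europe
Japan:144:120:Asia
Mexico:762:78:North America
China:3705:1032:Asia
India:1267:746:Asia
EOF

# -F: tells awk that fields are separated by colons.
# The pattern selects only records whose third field equals 55.
awk -F: '$3 == 55' countries
# prints:
# France:211:55:Europe
```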

In this example, the third field (which tells us each country's population) is tested against the value 55, and one record is selected. Fields are referred to by number as $1, $2, $3 and so on; $0 refers to the whole line. Because awk's default field separator is whitespace, we had to specify the colon as the field separator using the -F: option.
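
To see how the field numbers and the -F: option interact, the sketch below prints individual fields with and without the colon separator. The file name countries is an assumption, and the print actions are illustrative additions.

```shell
# Recreate the sample data. The file name "countries" is an assumption.
cat > countries <<'EOF'
Canada:3852:25:North America
USA:3615:237:North America
Brazil:3286:134:South America
England:94:56:Europe
France:211:55:Europe
Japan:144:120:Asia
Mexico:762:78:North America
China:3705:1032:Asia
India:1267:746:Asia
EOF

# With -F: the first field is the country name.
awk -F: '/France/ { print $1 }' countries
# prints: France

# Without -F: awk splits on whitespace. The France line contains no
# spaces, so the whole line is treated as field 1.
awk '/France/ { print $1 }' countries
# prints: France:211:55:Europe

# Print the name ($1) and population ($3) of each European country.
awk -F: '$4 == "Europe" { print $1, $3 }' countries
# prints:
# England 56
# France 55
```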

Two equals signs are used to make the above comparison. There are two so that awk can distinguish a comparison (==) from an assignment (=).
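
The difference matters in practice. The sketch below, again assuming a data file named countries, runs the same test once with == and once with a single =, to show what happens when the two are confused.

```shell
# Recreate the sample data. The file name "countries" is an assumption.
cat > countries <<'EOF'
Canada:3852:25:North America
USA:3615:237:North America
Brazil:3286:134:South America
England:94:56:Europe
France:211:55:Europe
Japan:144:120:Asia
Mexico:762:78:North America
China:3705:1032:Asia
India:1267:746:Asia
EOF

# Comparison: selects only the record whose third field is 55.
awk -F: '$3 == 55' countries
# prints: France:211:55:Europe

# Assignment: sets the third field of every record to 55. The result of
# the assignment (55) is non-zero, so the pattern is true for every line
# and all nine records are printed. Because a field was changed, awk
# rebuilds each line using the output field separator, a space by default.
awk -F: '$3 = 55' countries
# first line printed: Canada 3852 55 North America
```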

The full set of matching operators is:

==	equal to
!=	not equal to
>	greater than
<	less than
>=	greater than or equal to
<=	less than or equal to
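
As a brief illustration of the other operators, the sketch below applies a numeric test and a string test to the sample data. The file name countries and the threshold of 100 are assumptions chosen for the example.

```shell
# Recreate the sample data. The file name "countries" is an assumption.
cat > countries <<'EOF'
Canada:3852:25:North America
USA:3615:237:North America
Brazil:3286:134:South America
England:94:56:Europe
France:211:55:Europe
Japan:144:120:Asia
Mexico:762:78:North America
China:3705:1032:Asia
India:1267:746:Asia
EOF

# Greater than: records whose third field is larger than 100.
awk -F: '$3 > 100' countries
# prints:
# USA:3615:237:North America
# Brazil:3286:134:South America
# Japan:144:120:Asia
# China:3705:1032:Asia
# India:1267:746:Asia

# Not equal to: records whose fourth field is anything other than "Asia".
awk -F: '$4 != "Asia"' countries
# prints:
# Canada:3852:25:North America
# USA:3615:237:North America
# Brazil:3286:134:South America
# England:94:56:Europe
# France:211:55:Europe
# Mexico:762:78:North America
```

String values such as "Asia" must be quoted inside the awk program; numbers are written without quotes.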

After learning the above, you can test yourself with tutorial three's question-based exercise.
