
Basic AWK scripts

Exercise:
Script: open with rstudio ~/ost4sem/exercise/basic_adv_awk/basic_awk.sh
Data: ~/ost4sem/exercise/basic_adv_awk/input.txt
Directory: ~/ost4sem/exercise/basic_adv_awk

rstudio ~/ost4sem/exercise/basic_adv_awk/basic_awk.sh &

Predefined variables

Change to the exercise directory, then print columns and summarise the data with awk's predefined variables:

cd ~/ost4sem/exercise/basic_adv_awk
awk  '{ print $5 , $2 }' input.txt   # print columns 5 and 2
awk  '{ print NF }'  input.txt       # print the number of columns (fields) in each row
awk  '{ print NR }' input.txt        # print the number of the current row (record)
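
A minimal sketch of NR and NF on a small file. The temporary directory tmpdir, the file sample.txt and its contents are assumptions made up for illustration; they are not the course input.txt.

```shell
# Hypothetical sample data, not the course input.txt: 3 rows, 5 columns
tmpdir=$(mktemp -d)
printf '%s\n' 'a 10 1.5 x 300' 'b 20 2.5 y 100' 'c 30 3.5 z 200' > "$tmpdir/sample.txt"

# NR: number of the current row; NF: number of columns in it; $NF: last column
per_line=$(awk '{ print NR, NF, $NF }' "$tmpdir/sample.txt")
echo "$per_line"

# Inside END, NR still holds the number of the last row, i.e. the row count
total=$(awk 'END { print NR }' "$tmpdir/sample.txt")
echo "rows: $total"
```

Because NF is recomputed for every row, the first command is also a quick way to spot rows with a missing column.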

Use BEGIN and END

Print a header line before the data and a footer line after it:

awk 'BEGIN { print "CODE"} { print $5 } END {print "TOT"}' input.txt
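
The END block is also the usual place to print a value accumulated over all rows. The sketch below extends the command above with a running sum; the file sample.txt, its contents and the variable name sum are illustrative assumptions.

```shell
# Hypothetical sample data, not the course input.txt
tmpdir=$(mktemp -d)
printf '%s\n' 'a 10 1.5 x 300' 'b 20 2.5 y 100' 'c 30 3.5 z 200' > "$tmpdir/sample.txt"

# BEGIN runs once before the first row, the middle block once per row,
# END once after the last row
report=$(awk 'BEGIN { print "CODE" } { sum += $5; print $5 } END { print "TOT", sum }' "$tmpdir/sample.txt")
echo "$report"
```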

Bash and AWK

Sort on column 5 as numbers (-g, general numeric sort) and then print column 5:

sort -k 5,5 -g  input.txt  |   awk  '{ print $5 }'  > output.txt 

Print column 5 and then sort on column 1 (previously column 5), this time as text because -g is not given:

awk  '{ print $5 }'  input.txt  |  sort -k 1,1   > output.txt 
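
The two pipelines can give different orders, because the second one sorts without -g and therefore compares the values as text. A small sketch, assuming a made-up sample.txt whose column 5 holds 300, 20 and 100:

```shell
# Hypothetical sample data, not the course input.txt
tmpdir=$(mktemp -d)
printf '%s\n' 'a 10 1.5 x 300' 'b 20 2.5 y 20' 'c 30 3.5 z 100' > "$tmpdir/sample.txt"

# Numeric sort on column 5, then extract it (paste joins the lines for display)
numeric=$(sort -k 5,5 -g "$tmpdir/sample.txt" | awk '{ print $5 }' | paste -s -d ' ' -)

# Extract column 5 first, then sort it as text
lexical=$(awk '{ print $5 }' "$tmpdir/sample.txt" | LC_ALL=C sort -k 1,1 | paste -s -d ' ' -)

echo "numeric: $numeric"
echo "lexical: $lexical"
```

Adding -g to the second sort would make both pipelines agree.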

Bash variables in AWK

Pass a Bash variable to awk with -v and run the awk action inside a loop, printing columns 1 to 4 in turn:

for (( i=1 ; i<=4 ; i++ )) ; do
    awk -v i="$i" '{ print $i }' input.txt
done
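
The same -v mechanism works for any value, not only a loop counter. In this sketch the shell variables threshold and col, the awk names t and c, and the file sample.txt are all hypothetical:

```shell
# Hypothetical sample data, not the course input.txt
tmpdir=$(mktemp -d)
printf '%s\n' 'a 10 1.5 x 300' 'b 20 2.5 y 100' 'c 30 3.5 z 200' > "$tmpdir/sample.txt"

threshold=15   # assumed cut-off value
col=2          # assumed column to test

# t and c are awk variables set from the shell; $c is the column numbered c
selected=$(awk -v t="$threshold" -v c="$col" '$c > t { print $1 }' "$tmpdir/sample.txt" | paste -s -d ' ' -)
echo "$selected"
```

Inside the single-quoted awk program the shell never expands $c, so the value has to arrive through -v.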

AWK operations

Mathematical operation:

awk  '{ print log($1) }' input.txt     # natural logarithm of column 1

String operation:

awk  '{ print substr($1,1,4) }' input.txt 
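
A few more built-in functions, combined in one sketch. The file sample.txt and its two columns are assumptions for illustration; LC_ALL=C is set only so that the decimal point is always printed as ".".

```shell
# Hypothetical sample data: a name and a number per row
tmpdir=$(mktemp -d)
printf '%s\n' 'alpha 16' 'beta 25' > "$tmpdir/sample.txt"

# First 3 characters in upper case, length of the name, square root of column 2
ops=$(LC_ALL=C awk '{ printf "%s %d %.1f\n", toupper(substr($1, 1, 3)), length($1), sqrt($2) }' "$tmpdir/sample.txt")
echo "$ops"

# log() is the natural logarithm; divide by log(10) for a base-10 logarithm
log10_of_100=$(LC_ALL=C awk 'BEGIN { printf "%.3f\n", log(100) / log(10) }')
echo "$log10_of_100"
```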

Query operation:

awk  '{ if($3>2) print $3 }'  input.txt
awk  '{ if($3>=2) print $3 }' input.txt  
awk  '{ if($3<2) print $3 }'  input.txt
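
Conditions can be combined with && and ||, and an else branch handles the rows that do not match. A minimal sketch on a made-up three-column sample.txt; the labels "high" and "low" are arbitrary:

```shell
# Hypothetical sample data, not the course input.txt
tmpdir=$(mktemp -d)
printf '%s\n' 'a 10 1.5' 'b 20 2.5' 'c 30 3.5' > "$tmpdir/sample.txt"

# Rows whose column 3 lies strictly between 2 and 3
between=$(awk '{ if ($3 > 2 && $3 < 3) print $1, $3 }' "$tmpdir/sample.txt")
echo "$between"

# Label every row according to the same kind of test
labels=$(awk '{ if ($3 >= 2) print $1, "high"; else print $1, "low" }' "$tmpdir/sample.txt" | paste -s -d ',' -)
echo "$labels"
```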

The condition can also be placed outside the “{}”, as a pattern in front of the action.
For example, to add an index column to the input file, labelling the first row (the header) and numbering the other rows from 1:

awk 'NR==1 { print "index "$0 } NR>1 { print NR-1,$0 }' input.txt > output_withID.txt

The same operation can also be written with an explicit if/else inside the action:

awk ' { if (NR==1) { print "index "$0 } else { print NR-1,$0 }}' input.txt > output_withID.txt
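
A quick check that the two forms are equivalent, run on a small made-up file with a header row (sample.txt and the variable names are assumptions):

```shell
# Hypothetical sample data with a header row
tmpdir=$(mktemp -d)
printf '%s\n' 'name value' 'a 10' 'b 20' > "$tmpdir/sample.txt"

# Pattern form: the condition stands in front of each action
with_pattern=$(awk 'NR==1 { print "index "$0 } NR>1 { print NR-1, $0 }' "$tmpdir/sample.txt")

# if/else form: one action that tests NR itself
with_if=$(awk '{ if (NR==1) { print "index "$0 } else { print NR-1, $0 } }' "$tmpdir/sample.txt")

echo "$with_pattern"
[ "$with_pattern" = "$with_if" ] && echo "both forms give the same output"
```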

Query operation for file management:

To count the files in /tmp larger than 1 MB (1,000,000 bytes), chain three commands, piping the output of each one into the next:

ls -l /tmp/ | awk '{ s=$5; if (s>1000000) { print $0 } }' | wc -l
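
As a variation, awk can do the counting itself in an END block, which removes the need for wc. The sketch runs on a throwaway directory rather than /tmp; the file names big.bin and small.txt and the counter n are hypothetical.

```shell
# Build a throwaway directory with one large and one small file
tmpdir=$(mktemp -d)
dd if=/dev/zero of="$tmpdir/big.bin" bs=1000 count=2000 2>/dev/null   # 2,000,000 bytes
printf 'small\n' > "$tmpdir/small.txt"

# Column 5 of "ls -l" is the size in bytes; n + 0 prints 0 when nothing matches
nbig=$(ls -l "$tmpdir" | awk '$5 > 1000000 { n++ } END { print n + 0 }')
echo "files larger than 1 MB: $nbig"
```

This still relies on the column layout of ls -l, so it is best kept for quick interactive checks.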

Split a large file into smaller files of 100 lines each (blockfile1, blockfile2, and so on):

awk  'NR%100==1 { x = "blockfile" ++i } { print > x }'  input.txt
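
A sketch of the split on generated data, so that the result can be inspected. It assumes a made-up 250-line file data.txt, and it adds a close() call on the previous block, a precaution that is not part of the command above, so that only one output file is open at a time.

```shell
# Generate a hypothetical 250-line input file
tmpdir=$(mktemp -d)
awk 'BEGIN { for (n = 1; n <= 250; n++) print "line", n }' > "$tmpdir/data.txt"

# Start a new output file every 100 lines; close the previous one first
( cd "$tmpdir" &&
  awk 'NR % 100 == 1 { if (out) close(out); out = "blockfile" (++i) } { print > out }' data.txt )

nfiles=$(ls "$tmpdir"/blockfile* | wc -l)      # number of block files created
last_len=$(wc -l < "$tmpdir/blockfile3")       # lines in the last, shorter block
echo "files: $nfiles, lines in last file: $last_len"
```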
wiki/awkbasic.txt · Last modified: 2018/06/01 11:11 (external edit)