Advanced Bash-Scripting Guide

Advanced Bash-Scripting Guide - Nicku.org


This line occurs three times.<br />

This line occurs three times.<br />

This line occurs three times.<br />

bash$ uniq -c testfile<br />

1 This line occurs only once.<br />

2 This line occurs twice.<br />

3 This line occurs three times.<br />

bash$ sort testfile | uniq -c | sort -nr<br />

3 This line occurs three times.<br />

2 This line occurs twice.<br />

1 This line occurs only once.<br />

The sort INPUTFILE | uniq -c | sort -nr command string produces a frequency-of-occurrence listing for INPUTFILE. The first sort is needed because uniq collapses only adjacent duplicate lines; the -nr options to the second sort order the counts numerically, highest first. This template is useful for analyzing log files and dictionary lists, and wherever the lexical structure of a document needs to be examined.<br />
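
As a minimal sketch of how this template might be reused, the pipeline can be wrapped in a small function. The name freq_top and its optional line-count argument are assumptions made for illustration; they do not appear in the guide.

```shell
#!/bin/bash
# freq_top: hypothetical helper (not part of the guide) that prints the
# N most frequent lines read from standard input, most frequent first.
freq_top () {
  local n=${1:-10}        # Number of lines to show; defaults to 10.
  sort |                  # Group identical lines so they become adjacent.
    uniq -c |             # Prefix each distinct line with its count.
    sort -nr |            # Reverse numerical sort: highest counts first.
    head -n "$n"          # Keep only the top N entries.
}

# Possible usage: show the two most frequent lines of testfile.
# freq_top 2 < testfile
```

Reading from standard input keeps the helper usable at the end of a longer pipeline, for example after cut or awk has extracted one field from a log file.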

Example 12−8. Word Frequency Analysis<br />

#!/bin/bash<br />

# wf.sh: Crude word frequency analysis on a text file.<br />

# Check for input file on command line.<br />

ARGS=1<br />

E_BADARGS=65<br />

E_NOFILE=66<br />

if [ $# -ne "$ARGS" ] # Correct number of arguments passed to script?<br />

then<br />

echo "Usage: `basename $0` filename"<br />

exit $E_BADARGS<br />

fi<br />

if [ ! -f "$1" ] # Check if file exists.<br />

then<br />

echo "File \"$1\" does not exist."<br />

exit $E_NOFILE<br />

fi<br />

########################################################<br />

# main ()<br />

sed -e 's/\.//g' -e 's/ /\<br />
/g' "$1" | tr 'A-Z' 'a-z' | sort | uniq -c | sort -nr<br />
#                           =========================<br />
#                            Frequency of occurrence<br />

# Filter out periods and<br />

#+ change space between words to linefeed,<br />

#+ then shift characters to lowercase, and<br />

#+ finally prefix occurrence count and sort numerically.<br />

########################################################<br />
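
The script above removes only periods and splits only on single spaces, so commas, tabs, and repeated spaces end up inside or between "words". A rough variant, offered as a sketch and not as the guide's method, uses tr to turn every run of non-letters into one newline. The function name word_freq is an assumption made for illustration.

```shell
#!/bin/bash
# word_freq: hypothetical variant (not from the guide) of the wf.sh pipeline.
# Takes a filename and prints each distinct word with its count,
# most frequent first.
word_freq () {
  tr -cs '[:alpha:]' '\n' < "$1" |   # Replace each run of non-letters with one newline.
    sed '/^$/d' |                    # Drop the empty line left if the file starts with a non-letter.
    tr '[:upper:]' '[:lower:]' |     # Fold to lowercase so "The" and "the" match.
    sort | uniq -c | sort -nr        # Count occurrences and rank them.
}

# Possible usage:
# word_freq somefile.txt | head
```

One consequence of this choice is that digits and apostrophes also act as separators, so "don't" is counted as "don" and "t". Whether that is acceptable depends on the text being analyzed.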

Chapter 12. External Filters, Programs and Commands
