OPS145 Lab 9: Difference between revisions

From Littlesvr Wiki
 
=USING SED & AWK UTILITIES=
= Bash scripting =
<br>
Bash is the shell you've been using in this course to run Linux commands. Bash is also a programming language. It's a special purpose programming language, not something you would write a graphical application in. What you do in a bash script is essentially the exact equivalent of what you would run at a terminal prompt - except you can run all your commands at once, instead of one at a time.
===Main Objectives of this Practice Tutorial===


:* Use the '''sed''' command to '''manipulate text''' contained in a file.
A bash script is a plain text file. The bash programming language is interpreted (as opposed to compiled) - meaning the code you write doesn't need to be compiled before you can execute it.


:* List and explain several '''addresses''' and '''instructions''' associated with the '''sed''' command.
= Setup =
The setup for writing a bash script is minimal. You'll need to:


:* Use the '''sed''' command as a '''filter''' with Linux pipeline commands.
# Create the script in a plain text editor: either graphical or on the command line. Usually you save it with a '''.sh''' extension, though technically you don't have to.
# Make sure you (and anyone else you want to allow to execute the script) have '''read''' and '''execute''' permissions for the file.
# Add a '''shebang''' line at the top.


:* Use the '''awk''' command to '''manipulate text''' contained in a file.
== Permissions ==
Since bash scripts are interpreted (rather than executed outright): you can't actually execute a bash file. In order to execute the commands in a bash script: they need to be read, and interpreted, and executed, by the bash program.


:* List and explain '''comparison operators''', '''variables''' and '''actions''' associated with the '''awk''' command.
That's why just giving a script execute permissions may not be enough to run it. You need to give yourself read permission, so that the bash program can read your script and execute it.


:* Use the '''awk''' command as a '''filter''' with Linux pipeline commands.
In fact you don't even need execute permissions to run a bash script. You can run '''bash''', and give it the name of the script as the first argument.  
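A minimal sketch of running a script without execute permission, assuming a throwaway script in a temporary directory (the file name noexec.sh is hypothetical and not part of the lab):

```bash
#!/bin/bash
# Create a throwaway script in a temporary directory (hypothetical name).
workdir=$(mktemp -d)
printf '#!/bin/bash\necho "I ran"\n' > "$workdir/noexec.sh"

# Keep read permission but make sure there is no execute permission.
chmod 644 "$workdir/noexec.sh"

# Passing the script to bash as an argument works without execute permission,
# because bash only needs to read the file.
output=$(bash "$workdir/noexec.sh")
echo "$output"

# Clean up the temporary directory.
rm -r "$workdir"
```

Here bash itself is the program being executed; the script is just its input file.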
<br><br>


===Tutorial Reference Material===
== Shebang line ==
Because bash scripts are interpreted, and extensions are mostly ignored in Linux: the shell you're using to execute your script needs to know what kind of script it is. There are many interpreted programming languages. If you don't make it clear what language your script is written in: there's a chance it will be misinterpreted.


{|width="100%" cellspacing="0" cellpadding="10"
A shebang line for a bash script looks like this:<syntaxhighlight lang="bash">
#!/bin/bash


|- valign="top"
</syntaxhighlight>It has to be the first line in your script.


|colspan="2" style="font-size:16px;font-weight:bold;border-bottom: thin solid black;border-spacing:0px;padding-left:15px;"|Linux Command/Shortcut Reference<br>
Anything following this line is regular bash.


|- valign="top" style="padding-left:15px;"
= hello.sh =
|  style="padding-left:15px;" |'''Text Manipulation:'''
Decide for yourself whether you can handle the bash learning in this lab while using vi. If you feel that's too hard: you can use a graphical text editor.
* [https://www.digitalocean.com/community/tutorials/the-basics-of-using-the-sed-stream-editor-to-manipulate-text-in-linux Purpose of using the sed utility]
* [https://www.digitalocean.com/community/tutorials/how-to-use-the-awk-language-to-manipulate-text-in-linux Purpose of using the awk utility]


|  style="padding-left:15px;" |'''Commands:'''
* Create the '''lab9''' directory inside your home directory.
* [https://man7.org/linux/man-pages/man1/sed.1p.html sed]
* Open a text editor and save an empty file into '''~/lab9/hello.sh'''
* [https://man7.org/linux/man-pages/man1/awk.1p.html awk]
* Add the '''shebang''' line at the top
* Look at the permissions for your file in a terminal. You'll find that by default you have read and write permissions, but not execute permissions. Give yourself execute permissions using one of these two commands<syntaxhighlight lang="bash">
chmod a+x hello.sh # Add execute permissions for everyone, or:
chmod 755 hello.sh # Set permissions to exactly this


</syntaxhighlight>
*Run the script by specifying an absolute path, a relative-to-home path, or if your PWD is right: you can use the '''.''' special character as a shortcut:<syntaxhighlight lang="bash">
/home/yourusername/lab9/hello.sh # Run hello.sh using an absolute path
~/lab9/hello.sh # Run hello.sh using a relative-to-home path
./hello.sh # Run hello.sh from the PWD
</syntaxhighlight>
Your script doesn't do anything yet, but if you get any errors: the script might be in the wrong place, it might have the wrong permissions, or you're not executing it correctly.


|}
== echo ==
echo is an interesting command. Initially it may appear to do nothing of use, but with better understanding of how programming works it will make a lot more sense.


= KEY CONCEPTS =
The echo command '''prints some output''' (via STDOUT) - whatever output you tell it to print.


For example: if you run this in a terminal:<syntaxhighlight lang="bash">
echo Hello
</syntaxhighlight>it will print the word Hello.


===Using the sed Utility===
If you give echo multiple arguments: it will print them all, with one space in between each of them:<syntaxhighlight lang="bash">
echo Hello    there # Note the multiple spaces
</syntaxhighlight>Remember that in bash the arguments for a command are whitespace-separated. So it doesn't matter how many spaces you put between arguments to echo: they are still interpreted as separate arguments.


If you want to include multiple words and all the spacing between them in echo's output: combine them all into a single argument by enclosing the entire string of text into quotes:<syntaxhighlight lang="bash">
echo "Hello    there" # Note the multiple spaces
</syntaxhighlight>The echo command can be used for more complicated things, but this is all we need for this lab.


'''Usage:'''
* Add a line to your shell script so that when you run the script: it will print: "Hello. I will now do a bunch of stuff". It should look like this when you run it:


'''<span style="color:blue;font-weight:bold;font-family:courier;">Syntax:  sed [-n] 'address instruction' filename</span>'''  
[[File:Hello.sh-1.png|center]]One of the main reasons shell scripts are exceptionally useful is that once you get your script to work: you don't need to worry about typos, command syntax, or even remembering how exactly the commands work.


The other big reason is: you don't have to retype your commands every time you want to run them.


'''How it Works:'''
This simple echo program is a great example of that. You can already see that typing the command to run the script is much shorter than the one echo line inside the script. Obviously the longer the script: the greater the probability you will make mistakes, and the more you'd need to type when you wanted those commands executed.


* The sed command reads the input file one line at a time and applies the expression<br>(i.e. the area contained within quotes) to each line.
== date ==
* The expression can be within single quotes or double quotes.
Another very simple command is date.  
* The expression contains an address (match condition) and an instruction (operation).
* If the line matches the address, then it will perform the instruction.
* Lines will display by default unless the '''-n''' option is used to suppress default display
<br>
'''Address:'''


* Can use a line number, to select a specific line (for example: '''5''')
* Run '''date''' in a terminal. It will print the current date and time.
* Can specify a range of line numbers (for example: '''5,7''')
* Regular expressions are contained within forward slashes (e.g. /regular-expression/)
* Can specify a regular expression to select all lines that match a pattern  (e.g. '''/^[0-9].*[0-9]$/''')
* If NO address is present, the instruction will apply to ALL lines
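The three kinds of address listed above can be sketched as follows. The sample text is made up for illustration and is not the lab's data.txt:

```bash
#!/bin/bash
# Build a small sample input (hypothetical data, not the lab's data.txt).
sample=$(printf 'one\n2 apples 3\nthree\n4 pears 5\nfive\n')

# Line-number address: print only line 3.
line3=$(printf '%s\n' "$sample" | sed -n '3 p')

# Range address: print lines 2 through 4.
range=$(printf '%s\n' "$sample" | sed -n '2,4 p')

# Regular-expression address: print lines that start and end with a digit.
digits=$(printf '%s\n' "$sample" | sed -n '/^[0-9].*[0-9]$/ p')

echo "$line3"
echo "$range"
echo "$digits"
```

The '''-n''' option is used in each command so that only the lines selected by the address are printed.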


Using arguments you can change the format of the output, add or remove parts that you want or don't want to see. We won't need to do that but if you're curious: you can look at the man page for date, under the FORMAT section.


[[Image:sed.png|right|500px|]]
* Modify your script so that its output looks like this (except with the current date, don't hard-code the 25th of march in your script):
'''Instruction:'''
*'''Action''' to take for matched line(s)
*Refer to table on right-side for list of some<br>'''common instructions''' and their purpose
<br><br>


===Using the awk Utility===
[[File:Hello.sh-2.png|center]]


'''Usage:'''
== Re-running a script ==
A script is usually meant to execute unattended. Ideally your script will account for the most common variability in the environment.


<span style="color:blue;font-weight:bold;font-family:courier;">awk [-F] 'selection-criteria {action}' file-name</span>
For example: if you mean to create a directory in a script, and you run the script a second time: the script should not cause errors the second time because the directory was already created the first time. You should decide when you build the script how to deal with this possibility.


As an example: let's start with a script like this:<syntaxhighlight lang="bash">
#!/bin/bash


'''How It Works:'''
mkdir ~/lab9/temp/


* The '''awk''' command reads the input file one line at a time and applies the expression (contained within quotes) to each line for processing.
</syntaxhighlight>
*The '''expression''' (contained in quotes) consists of '''selection criteria''' and an '''action''' to execute, contained within braces '''{}'''
* If the selection criteria are matched, then the action (between braces) is executed.
* The '''-F''' option can be used to specify the '''field delimiter''' (separator) character<br>eg. '''awk -F";"'''  (would indicate a semi-colon delimited input file).
<br>
'''Selection Criteria'''


* You can use a regular expression, enclosed within slashes, as a pattern. For example: '''/pattern/'''
* Save that code as '''~/lab9/lab9.sh''' and give it execute permissions.
* The ~ operator tests whether a field or variable matches a regular expression. For example:  '''$1 ~ /^[0-9]/'''
* Run lab9.sh twice.
* The '''!~''' operator tests for no match. For example: '''$2 !~ /line/'''
* Note that the first time you run it: it creates the temp directory. The second time you run it: you get an error.
* You can perform both numeric and string comparisons using relational operators ( '''>''' , '''>=''' , '''<''' , '''<=''' , '''==''' , '''!=''' ).
* You can combine any of the patterns using the Boolean operators '''||''' (OR) and '''&&''' (AND).
* You can use built-in variables (like NR or "record number" representing line number) with comparison operators.<br>For example: '''NR >=1 && NR <= 5'''
<br>
'''Action (execution):'''


* Action to be executed is contained within braces '''{}'''
Depending on the circumstances you might want to ignore the error, or make sure the directory is deleted before you attempt to create it, or you might actually want to get an error.
* The '''print''' command can be used to display text (fields).
* You can use parameters which represent fields within records (lines) within the expression of the awk utility.
* The parameter '''$0''' represents all of the fields contained in the record (line).
* The parameters '''$1''', '''$2''', '''$3''' … '''$9''' represent the first, second, third, and so on up to the ninth fields contained within the record.
* Fields beyond the ninth are referenced the same way (for example:  '''$10''', '''$11''', '''$12''', etc.). The brace form '''${10}''' is shell positional-parameter syntax and is not valid inside an awk expression.
* You can use built-in '''variables''' (such as '''NR''' or "record number" representing line number)<br>eg. '''{print NR,$0}'''  (will print record number, then entire record).
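A short sketch of selection criteria and actions working together. The three sample records (name, quantity, price) are hypothetical and exist only for this illustration:

```bash
#!/bin/bash
# Hypothetical whitespace-separated sample records: name, quantity, price.
sample=$(printf 'apple 3 120\nbanana 12 45\ncherry 7 300\n')

# Regular-expression match on field 1; print the record number and whole record.
matched=$(printf '%s\n' "$sample" | awk '$1 ~ /^b/ {print NR, $0}')

# Numeric comparisons combined with && : quantity above 5 AND price above 100.
filtered=$(printf '%s\n' "$sample" | awk '$2 > 5 && $3 > 100 {print $1}')

echo "$matched"
echo "$filtered"
```

The comma in '''{print NR, $0}''' makes awk separate the two values with a space in the output.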


=INVESTIGATION 1: USING THE SED UTILITY=
* In our case let's say we want a brand new, empty, temp directory created every time the script runs. To ensure that happens: delete the directory before you create it:<syntaxhighlight lang="bash">
#!/bin/bash


<span style="color:red;">'''ATTENTION''': This online tutorial will be required to be completed by '''Friday in week 11 by midnight''' to obtain a grade of '''2%''' towards this course</span><br><br>
rm -r ~/lab9/temp/
mkdir ~/lab9/temp/


In this investigation, you will learn how to manipulate text using the '''sed''' utility.
</syntaxhighlight>
* Now you can run the script as many times as you want, and every time you will end up with an empty temp directory.


This script still has some potential problems, but all of them are dealt with in the same way:


'''Perform the Following Steps:'''
# Anticipate what could go wrong
# Structure your code to do something you want under different circumstances
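One possible way to apply those two steps, shown as a sketch rather than the only correct approach. It assumes the thing that could go wrong is that the temp directory does not exist yet when '''rm''' runs; the function name fresh_temp and the temporary working directory are hypothetical, chosen so the example does not touch ~/lab9:

```bash
#!/bin/bash
# Work inside a temporary directory so the sketch does not touch ~/lab9.
base=$(mktemp -d)

# Hypothetical helper: guarantee a fresh, empty temp directory.
fresh_temp() {
    # -f keeps rm quiet (and successful) when the directory does not exist yet.
    rm -r -f "$base/temp"
    mkdir "$base/temp"
}

fresh_temp                       # first run: nothing to delete, directory is created
touch "$base/temp/leftover.txt"  # simulate files left behind by earlier work
fresh_temp                       # second run: old directory is removed, then recreated

# Count what is left in the directory after the second run.
remaining=$(ls "$base/temp" | wc -l)
echo "Files in temp after second run: $remaining"

# Clean up.
rm -r "$base"
```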


# '''Login''' to your matrix account and confirm you are located in your '''home''' directory.<br><br>
= lab9-dolab3.sh =
# Issue a Linux command to create a directory called '''sed'''<br><br>
We're going to write a script which will do all the command-line work you've done in lab 3, starting with the [[OPS145 Lab 3#Create directories with mkdir|mkdir section]].
# Issue a Linux command to <u>change</u> to the '''sed''' directory and confirm that you are located in the '''sed''' directory.<br><br>
# Issue the following Linux command to copy the data.txt file<br>('''copy and paste''' to save time):<br><span style="color:blue;font-weight:bold;font-family:courier;">cp ~uli101/tutorialfiles/data.txt ~/sed</span><br><br>
# Issue the '''more''' command to quickly view the contents of the '''data.txt''' file.<br>When finished, exit the more command by pressing the letter <span style="color:blue;font-weight:bold;font-family:courier;">q</span>[[Image:sed-1.png|thumb|right|300px|Issuing the '''p''' instruction without using the '''-n''' option (to suppress original output) will display lines twice.]]<br><br>The '''p''' instruction with the '''sed''' command is used to<br>'''print''' (i.e. ''display'') the contents of a text file.<br><br>
# Issue the following Linux command:<br><span style="color:blue;font-weight:bold;font-family:courier;">sed 'p' data.txt</span><br><br>'''NOTE: You should notice that each line appears twice'''.<br><br>The reason why standard output appears twice is that the sed command<br>(without the '''-n option''') displays all lines regardless of an address used.<br><br>We will use '''pipeline commands''' to both display stdout to the screen and save to files<br>for <u>confirmation</u> of running these pipeline commands when you run a '''checking-script''' later in this investigation.<br><br>
# Issue the following Linux pipeline command:<br><span style="color:blue;font-weight:bold;font-family:courier;">sed -n 'p' data.txt | tee sed-1.txt</span><br><br>What do you notice? You should see each line only once.<br><br>You can specify an '''address''' to display lines using the sed utility<br>(eg. ''line #'', '''line #s''' or range of '''line #s''').<br><br>
# Issue the following Linux pipeline command:<br><span style="color:blue;font-weight:bold;font-family:courier;">sed -n '1 p' data.txt | tee sed-2.txt</span><br><br>You should see the first line of the text file displayed.<br>What other command is used to only display the first line in a file?<br><br>[[Image:sed-2.png|thumb|right|500px|Using the sed command to display a '''range''' of lines.]]
# Issue the following Linux pipeline command:<br><span style="color:blue;font-weight:bold;font-family:courier;">sed -n '2,5 p' data.txt | tee sed-3.txt</span><br><br>What is displayed? How would you modify the sed command to display the line range 10 to 50?<br><br>The '''s''' instruction is used to '''substitute''' text<br>(a similar method was demonstrated in the vi editor in tutorial 9).<br><br>
# Issue the following Linux pipeline command:<br><span style="color:blue;font-weight:bold;font-family:courier;">sed '2,5 s/TUTORIAL/LESSON/g' data.txt | tee sed-4.txt | more</span><br><br>What do you notice? View the original contents of lines 2 to 5 in the '''data.txt''' file<br>in another shell to confirm that the substitution occurred.<br><br>[[Image:sed-3.png|thumb|right|500px|Using the sed command with the '''q''' instruction to display up to a line number, then quit.]]The '''q''' instruction terminates or '''quits''' the execution of the sed utility as soon as it reaches a particular line or matching pattern.<br><br>
# Issue the following Linux pipeline command:<br><span style="color:blue;font-weight:bold;font-family:courier;">sed '11 q' data.txt | tee sed-5.txt</span><br><br>What did you notice? How many lines were displayed<br>before the sed command exited?<br><br>You can use '''regular expressions''' to select lines that match a pattern. In fact,<br>the sed command was one of the <u>first</u> Linux commands that used regular expressions.<br><br>The rules remain the same for using regular expressions as demonstrated in '''tutorial 9'''<br>except the regular expression must be contained within '''forward slashes'''<br>(eg. <span style="font-family:courier;font-weight:bold;">/regexp/</span> ).<br><br>[[Image:sed-4.png|thumb|right|400px|Using the sed command using regular expressions with '''anchors'''.]]
# Issue the following Linux pipeline command:<br><span style="color:blue;font-weight:bold;font-family:courier;">sed -n '/^The/ p' data.txt | tee sed-6.txt</span><br><br>What do you notice?<br><br>
# Issue the following Linux pipeline command:<br><span style="color:blue;font-weight:bold;font-family:courier;">sed -n '/d$/ p' data.txt | tee sed-7.txt</span><br><br>What do you notice?<br><br>The '''sed''' utility can also be used as a '''filter''' to manipulate text that<br>was generated from Linux commands.<br><br>[[Image:sed-5.png|thumb|right|400px|Using the sed command with '''pipeline''' commands.]]
# Issue the following Linux pipeline command:<br><span style="color:blue;font-weight:bold;font-family:courier;">who | sed -n '/^[a-m]/ p' | tee sed-8.txt | more</span><br><br>What did you notice?<br><br>
# Issue the following Linux pipeline command:<br><span style="color:blue;font-weight:bold;font-family:courier;">ls | sed -n '/txt$/ p' | tee sed-9.txt</span><br><br>What did you notice?<br><br>
# Issue the following to run a checking script:<br><span style="color:blue;font-weight:bold;font-family:courier;">~uli101/week10-check-1</span><br><br>If you encounter errors, make corrections and '''re-run''' the checking script<br>until you receive a congratulations message, then you can proceed.<br><br>
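The '''s''', '''q''' and '''p''' instructions from this investigation can be tried on a small made-up sample, for cases where data.txt is not available. The sample text below is hypothetical and only stands in for data.txt:

```bash
#!/bin/bash
# Hypothetical sample text standing in for data.txt.
sample=$(printf 'The TUTORIAL begins\nsecond line\nThe TUTORIAL ends\nlast word\n')

# s instruction limited to a line range: only lines 1 to 2 are considered,
# so the TUTORIAL on line 3 is left unchanged.
substituted=$(printf '%s\n' "$sample" | sed '1,2 s/TUTORIAL/LESSON/g')

# q instruction: print up to and including line 2, then quit.
first_two=$(printf '%s\n' "$sample" | sed '2 q')

# Regular-expression address with an anchor; -n suppresses the default output.
starts_the=$(printf '%s\n' "$sample" | sed -n '/^The/ p')

echo "$substituted"
echo "$first_two"
echo "$starts_the"
```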


:In the next investigation, you will learn how to manipulate text using the '''awk''' utility.<br><br>
* Create '''~/lab9/lab9-dolab3.sh'''
* Your script has a PWD, no different than the interactive shell you've been using this whole time. If you want to run '''mkdir lab3-cmd''' and have that create lab3-cmd in your home directory: you need to change your PWD first:<syntaxhighlight lang="bash">
#!/bin/bash


=INVESTIGATION 2: USING THE AWK UTILITY =
echo "My pwd is now: "
pwd


In this investigation, you will learn how to use the awk utility to manipulate text and generate reports.
cd ~
echo "My pwd is now: "
pwd


'''Perform the Following Steps:'''
</syntaxhighlight>
* Note that even though inside the script the pwd was your home directory: once the script exits your PWD is set back to what it was before the script executed. That's because a new bash process is started when you run your script, and that new process exits when the script is done executing:


# Change to your '''home''' directory and issue a command to '''confirm'''<br>you are located in your ''home'' directory.<br><br>
[[File:PwdInScript.png|center]]
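The same effect can be reproduced without a separate script file. This sketch assumes only that '''bash -c''' starts a new bash process, much like running a script does:

```bash
#!/bin/bash
# Remember where the parent shell is before starting the child.
start_dir=$(pwd)

# Run a child bash process that changes directory and reports its own PWD.
child_dir=$(bash -c 'cd / && pwd')

# The parent's PWD is unaffected by the cd that happened in the child.
end_dir=$(pwd)

echo "Child saw: $child_dir"
echo "Parent before: $start_dir, after: $end_dir"
```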
# Issue a Linux command to create a directory called '''awk'''<br><br>
# Issue a Linux command to <u>change</u> to the '''awk''' directory and confirm you are located in the '''awk''' directory.<br><br>Let's download a database file that contains information regarding classic cars.<br><br>
# Issue the following linux command:<br><span style="color:blue;font-weight:bold;font-family:courier;">cp ~uli101/tutorialfiles/cars.txt ~/awk</span><br><br>
# Issue the '''cat''' command to quickly view the contents of the '''cars.txt''' file.<br><br>The "'''print'''" action (command) is the <u>default</u> action of awk to print<br>all selected lines that match a '''pattern'''.<br><br>This '''action''' (contained in braces) can provide more options<br>such as printing '''specific fields''' of selected lines (or records) from a database.<br><br>[[Image:awk-1.png|thumb|right|400px|Using the awk command to display matches of the pattern '''ford'''.]]
# Issue the following linux command to display all lines (i.e. records) in the '''cars.txt''' database that match the pattern (or "make") called '''ford''':<br><span style="color:blue;font-weight:bold;font-family:courier;">awk '/ford/ {print}' cars.txt</span><br><br>We will use '''pipeline commands''' to both display stdout to the screen and save to files for <u>confirmation</u> of running these pipeline commands when you run a '''checking-script''' later in this investigation.<br><br>
# Issue the following linux pipeline command to display all records<br>in the '''cars.txt''' database that contain the pattern (i.e. make) '''ford''':<br><span style="color:blue;font-weight:bold;font-family:courier;">awk '/ford/' cars.txt | tee awk-1.txt</span><br><br>What do you notice? You should notice the same matching lines displayed even <u>without</u> the '''{print}''' action, because print is the default action.<br><br>You can use ''builtin'' '''variables''' with the '''print''' command for further processing.<br>We will discuss the following variables in this tutorial:<br><br>[[Image:awk-2.png|thumb|right|400px|Using the awk command to print search results by '''field number'''.]]'''$0''' - Current record (entire line)<br>'''$1''' - First field in record<br>'''$n''' - nth field in record<br>'''NR''' - Record Number (order in database)<br> '''NF''' - Number of fields in current record<br><br>For a listing of more variables, please consult your course notes.<br><br>
# Issue the following linux pipeline command to display the '''model''', '''year''', '''quantity''' and price<br>in the '''cars.txt''' database for makes of '''chevy''':<br><span style="color:blue;font-weight:bold;font-family:courier;">awk '/chevy/ {print $2,$3,$4,$5}' cars.txt | tee awk-2.txt</span><br><br>Notice that a '''space''' is the delimiter for the fields that appear as standard output.<br><br>The '''tilde character''' '''~''' is used to test whether a particular field matches a pattern.<br><br>
# Issue the following linux pipeline command to display all '''plymouths''' ('''plym''')<br>by '''model name''', '''price''' and '''quantity''':<br><span style="color:blue;font-weight:bold;font-family:courier;">awk '$1 ~ /plym/ {print $2,$3,$4,$5}' cars.txt | tee awk-3.txt</span><br><br>You can also use '''comparison operators''' to specify conditions for processing with matched patterns<br>when using the awk command. Since they are used WITHIN the awk expression,<br>they are not confused with redirection symbols<br><br>[[Image:awk-3.png|thumb|right|400px|Using the awk command to display results based on '''comparison operators'''.]]'''<''' &nbsp;&nbsp;&nbsp;&nbsp;Less than<br>'''<=''' &nbsp;&nbsp;Less than or equal<br>'''>''' &nbsp;&nbsp;&nbsp;&nbsp;Greater than<br>'''>=''' &nbsp;&nbsp;Greater than or equal<br>'''==''' &nbsp;&nbsp;Equal<br>'''!=''' &nbsp;&nbsp;&nbsp;Not equal<br><br>
# Issue the following linux pipeline command to display the '''car make''', '''model''', '''quantity''' and '''price''' of all vehicles whose '''prices are less than $5,000''':<br><span style="color:blue;font-weight:bold;font-family:courier;">awk '$5 < 5000 {print $1,$2,$4,$5}' cars.txt | tee awk-4.txt</span><br><br>What do you notice?<br><br>
# Issue the following linux pipeline command to display '''price''',<br>'''quantity''', '''model''' and '''car make''' of vehicles whose '''prices are less than $5,000''':<br><span style="color:blue;font-weight:bold;font-family:courier;">awk '$5 < 5000 {print $5,$4,$2,$1}' cars.txt | tee awk-5.txt</span><br><br>
# Issue the following linux pipeline command to display the '''car make''',<br>'''year''' and '''quantity''' of cars that '''begin''' with the '''letter 'f'''':<br><span style="color:blue;font-weight:bold;font-family:courier;">awk '$1 ~ /^f/ {print $1,$2,$4}' cars.txt | tee awk-6.txt</span><br><br>[[Image:awk-4.png|thumb|right|400px|Using the awk command to display combined search results based on '''compound operators'''.]]Combined pattern searches can be made<br>by using '''compound operator''' symbols:<br><br>'''&&''' &nbsp;&nbsp;&nbsp;&nbsp;(and)<br>'''||''' &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(or)<br><br>
# Issue the following linux pipeline command to list all '''fords'''<br>whose '''price is greater than $10,000''':<br><span style="color:blue;font-weight:bold;font-family:courier;">awk '$1 ~ /ford/ && $5 > 10000 {print $0}' cars.txt | tee awk-7.txt</span><br><br>
# Issue the following linux command:<br><span style="color:blue;font-weight:bold;font-family:courier;">cp ~uli101/tutorialfiles/cars2.txt ~/awk</span><br><br>
# Issue the '''cat''' command to quickly view the contents of the '''cars2.txt''' file.<br><br>
# Issue the following linux pipeline command to display the '''year'''<br>and '''quantity''' of cars that '''begin''' with the '''letter 'f'''' for the '''cars2.txt''' database:<br><span style="color:blue;font-weight:bold;font-family:courier;">awk '$1 ~ /^f/ {print $2,$4}' cars2.txt | tee awk-8.txt</span><br><br>What did you notice?<br><br>The problem is that the '''cars2.txt''' database separates each field by a semi-colon (''';''') <u>instead</u> of '''TAB'''.<br>Therefore, it does not recognize the second and fourth fields.<br><br>You need to issue awk with the -F option to indicate that this file's fields are separated (delimited) by a semi-colon.<br><br>
# Issue the following linux pipeline command to display the '''year'''<br>and '''quantity''' of cars that '''begin''' with the '''letter 'f'''' for the '''cars2.txt''' database:<br><span style="color:blue;font-weight:bold;font-family:courier;">awk -F";" '$1 ~ /^f/ {print $2,$4}' cars2.txt | tee awk-9.txt</span><br><br>What did you notice this time?<br><br>
# Issue the following to run a checking script:<br><span style="color:blue;font-weight:bold;font-family:courier;">~uli101/week10-check-2</span><br><br>If you encounter errors, make corrections and '''re-run''' the checking script until you<br>receive a congratulations message, then you can proceed.<br><br>
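The effect of the '''-F''' option can be sketched on a few made-up records. The data below is hypothetical (make;model;year;quantity) and is not the real cars2.txt:

```bash
#!/bin/bash
# Hypothetical semicolon-delimited records: make;model;year;quantity.
sample=$(printf 'ford;mustang;65;45\nchevy;nova;79;60\nfiat;600;65;115\n')

# Without -F, awk splits on whitespace, so each whole line is field 1
# and $2 is empty for every matching record.
no_delim=$(printf '%s\n' "$sample" | awk '$1 ~ /^f/ {print $2}')

# With -F";" the fields are split on semicolons, so $3 and $4 are usable.
with_delim=$(printf '%s\n' "$sample" | awk -F";" '$1 ~ /^f/ {print $3,$4}')

echo "Without -F: '$no_delim'"
echo "$with_delim"
```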


= LINUX PRACTICE QUESTIONS =
* Have a look at what's inside your ~/lab3-cmd now. If you finished lab3: you should have some subdirectories and files in there. Maybe make a backup copy of it, so you have something to compare your script's work to.
* We want this lab9-dolab3.sh script to do all the work, starting with creating lab3-cmd. So our script will delete lab3-cmd and all its contents before it does anything else. Try this:<syntaxhighlight lang="bash">
#!/bin/bash


The purpose of this section is to obtain '''extra practice''' to help with '''quizzes''', your '''midterm''', and your '''final exam'''.
echo "My pwd is now: "
pwd


Here is a link to the MS Word Document of ALL of the questions displayed below but with extra room to answer on the document to
cd ~
simulate a quiz:
echo "My pwd is now: "
pwd


https://wiki.cdot.senecacollege.ca/uli101/files/uli101_week11_practice.docx
rm -r lab3-cmd


Your instructor may take-up these questions during class. It is up to the student to attend classes in order to obtain the answers to the following questions. Your instructor will NOT provide these answers in any other form (eg. e-mail, etc).
</syntaxhighlight>
* Run the script twice. Note that the second time you get an error, because you are trying to delete a directory which is no longer there.
* You can fix that by adding the '''-f''' option to rm:<syntaxhighlight lang="bash">
#!/bin/bash


echo "My pwd is now: "
pwd


'''Review Questions:'''
cd ~
echo "My pwd is now: "
pwd


'''Part A: Display Results from Using the sed Utility'''
rm -r -f lab3-cmd


Note the contents from the following tab-delimited file called '''~murray.saul/uli101/stuff.txt''':
</syntaxhighlight>
(this file pathname exists for checking your work)
* You can use a terminal or a graphical file manager to check on the results of running your script.
* The next few instructions we can copy verbatim from lab 3:<syntaxhighlight lang="bash">
#!/bin/bash


<pre>
# Start in home directory
Line one.
cd ~
This is the second line.
This is the third.
This is line four.
Five.
Line six follows
Followed by 7
Now line 8
and line nine
Finally, line 10
</pre>


# Delete the lab3-cmd directory and its contents if it exists
rm -r -f lab3-cmd


Write the results of each of the following Linux commands for the above-mentioned file:
# This way you make sure lab3-cmd exists and is empty
mkdir lab3-cmd


# These commands are copied verbatim from lab3:
mkdir lab3-cmd/red
cd lab3-cmd/red
pwd


# <span style="font-family:courier;font-weight:bold">sed -n '3,6 p' ~murray.saul/uli101/stuff.txt</span><br><br>
mkdir ../green
# <span style="font-family:courier;font-weight:bold">sed '4 q' ~murray.saul/uli101/stuff.txt</span><br><br>
pwd
# <span style="font-family:courier;font-weight:bold">sed '/the/ d' ~murray.saul/uli101/stuff.txt</span><br><br>
ls
# <span style="font-family:courier;font-weight:bold">sed 's/line/NUMBER/g' ~murray.saul/uli101/stuff.txt</span>
ls ..


# Make sure to replace youruserid with your user id:
mkdir /home/youruserid/lab3-cmd/blue
ls ~/lab3-cmd


'''Part B: Writing Linux Commands Using the sed Utility'''
</syntaxhighlight>
* Note that when the script gets longer: it's helpful (for you) to have some comments in it explaining what you're intending the script to do.
* Also: you may get confused about which part of your script produces which output. In the latest version we have, it's not clear which is the output from '''ls''' and which from '''ls ..'''
* A common way to deal with this is to add some echos:<syntaxhighlight lang="bash">
#!/bin/bash


Write a single Linux command to perform the specified tasks for each of the following questions.
# Start in home directory
cd ~


# Delete the lab3-cmd directory and its contents if it exists
rm -r -f lab3-cmd


# Write a Linux sed command to display only lines 5 to 9 for the file: '''~murray.saul/uli101/stuff.txt'''<br><br>
# This way you make sure lab3-cmd exists and is empty
# Write a Linux sed command to display only lines that begin with the pattern “and” for the file: '''~murray.saul/uli101/stuff.txt'''<br><br>
mkdir lab3-cmd
# Write a Linux sed command to display only lines that end with a digit for the file: '''~murray.saul/uli101/stuff.txt'''<br><br>
# Write a Linux sed command to save lines that match the pattern “line” (upper or lowercase) for the file: '''~murray.saul/uli101/stuff.txt''' and save results (overwriting previous contents) to: '''~/results.txt'''<br><br>


# These commands are copied verbatim from lab3:
mkdir lab3-cmd/red
cd lab3-cmd/red
pwd


'''Part C: Writing Linux Commands Using the awk Utility'''
mkdir ../green
pwd
echo "Contents of the directory above (red):"
ls
echo "Contents of the parent of that directory (~/lab3-cmd/):"
ls ..


Note the contents from the following tab-delimited file called '''~murray.saul/uli101/stuff.txt''':
# Make sure to replace youruserid with your user id:
(this file pathname exists for checking your work)
mkdir /home/youruserid/lab3-cmd/blue
echo "Contents of ~/lab3-cmd/:"
ls ~/lab3-cmd


<pre>
</syntaxhighlight>
Line one.
* Let's skip the [[OPS145 Lab 3#Removing empty directories with rmdir and rm -r|removing directories section]] of lab 3 since that doesn't leave anything for us to look at. We'll continue with the [[OPS145 Lab 3#Copying files and directories with cp and cp -r|cp section]].
This is the second line.
*Add the first two commands from that section to your script and run it:<syntaxhighlight lang="bash">
This is the third.
cd ~
This is line four.
cp Downloads/SampleFiles.tar.xz lab3-cmd/red
Five.
</syntaxhighlight>
Line six follows
*Then check that your script did what it was supposed to. There are multiple ways to do that:
Followed by 7
*#Add the '''-v''' argument to cp. That way it will print what it's copying into where.
Now line 8
*#Add a temporary ls command inside your script, to check that the new file has been copied to lab3-cmd/red successfully.
and line nine
*#Look in that directory outside your script.
Finally, line 10
When you write a long script: it's good practice to check that smaller parts of it work, rather than finish the whole script and then try to figure out at which point it didn't do what it was supposed to.
</pre>


* Add the rest of the commands from the cp section to your script: so you'll end up with the same contents as you did at the end of that section in lab 3 (there's a screenshot there).
* Finally: add commands to your script to also complete the [[OPS145 Lab 3#Moving and renaming files and directories with mv|moving section]] and [[OPS145 Lab 3#Deleting files|deleting section]] of lab 3.
Don't forget to run your script frequently to check that your latest modifications worked. Don't ignore errors: fix them before you move on.


'''Write the results of each of the following Linux commands for the above-mentioned file:'''
== What was the point? ==
You might be thinking: writing this script was not any easier than doing lab3 one command at a time. So what was the point? There are two:


# You needed an introduction to scripting, and with this one you were already familiar with the expected results.
# You don't write a script to execute it once. Imagine you had multiple machines, a hundred perhaps, and they all needed lab3 completed on them. Doing it one command at a time would take 100 times longer. Running the script 100 times would barely take any time at all.


# <span style="font-family:courier;font-weight:bold">awk 'NR == 3 {print}' ~murray.saul/uli101/stuff.txt</span><br><br>
== time ==
# <span style="font-family:courier;font-weight:bold">awk 'NR >= 2 && NR <= 5 {print}' ~murray.saul/uli101/stuff.txt</span><br><br>
There's a command in Linux called '''time'''. It doesn't give you the current time (the '''date''' command does that). The time command measures how long another command takes to run. For example, you can measure how long your new script takes to run:<syntaxhighlight lang="bash">
# <span style="font-family:courier;font-weight:bold">awk '$1 ~ /This/ {print $2}' ~murray.saul/uli101/stuff.txt</span><br><br>
time ./lab9-dolab3.sh
# <span style="font-family:courier;font-weight:bold">awk '$1 ~ /This/ {print $3,$2}' ~murray.saul/uli101/stuff.txt</span><br><br>
</syntaxhighlight>You can ignore the user/sys times. The "real" time is time as we humans experience it.
 
 
'''Part D: Writing Linux Commands Using the awk Utility'''
 
 
Write a single Linux command to perform the specified tasks for each of the following questions.
 
 
# Write a Linux awk command to display all records for the file: '''~/cars''' whose fifth field is greater than 10000.<br><br>
# Write a Linux awk command to display the first and fourth fields for the file: '''~/cars''' whose fifth field begins with a number.<br><br>
# Write a Linux awk command to display the second and third fields for the file: '''~/cars''' for records that match the pattern “chevy”.<br><br>
# Write a Linux awk command to display the first and second fields for all the records contained in the file: '''~/cars'''<br><br>
 
 
_________________________________________________________________________________
 
Author:  Murray Saul
 
License: LGPL version 3
Link:    https://www.gnu.org/licenses/lgpl.html
 
_________________________________________________________________________________
 


=Submit evidence of your work=


After you finish the lab: run the following commands to submit your work:<syntaxhighlight lang="bash">
cd ~
wget http://ops345.ca/check/ops145-lab9-check.sh # Download the check script
chmod 700 ops145-lab9-check.sh # Make the downloaded file executable
./ops145-lab9-check.sh # Run the check script
</syntaxhighlight>If it says "Your lab9 has been submitted": make a screenshot, and you're done. If it gives you any warnings or errors: you have to fix them and try the ./ops145-lab9-check.sh command again.


[[Category:OPS145]]

Latest revision as of 11:58, 27 March 2024

Bash scripting

Bash is the shell you've been using in this course to run Linux commands. Bash is also a programming language. It's a special purpose programming language, not something you would write a graphical application in. What you do in a bash script is essentially the exact equivalent of what you would run at a terminal prompt - except you can run all your commands at once, instead of one at a time.

A bash script is a plain text file. The bash programming language is interpreted (as opposed to compiled) - meaning the code you write doesn't need to be compiled before you can execute it.

Setup

The setup for writing a bash script is minimal. You'll need to:

  1. Create the script in a plain text editor: either graphical or on the command line. Usually you save it with a .sh extension, though technically you don't have to.
  2. Make sure you (and anyone else you want to allow to execute the script) have read and execute permissions for the file.
  3. Add a shebang line at the top.

Permissions

Since bash scripts are interpreted (rather than executed outright): you can't actually execute a bash file. In order to execute the commands in a bash script: they need to be read, and interpreted, and executed, by the bash program.

That's why just giving a script execute permissions may not be enough to run it. You need to give yourself read permission, so that the bash program can read your script and execute it.

In fact you don't even need execute permissions to run a bash script. You can run bash, and give it the name of the script as the first argument.
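As a minimal sketch of that last point: the script below has read and write permissions only, and it still runs when you hand it to bash as an argument. The noexec.sh file name and the temporary directory are made up for this illustration and are not part of the lab.

```bash
#!/bin/bash

# Work in a throwaway directory (hypothetical, not part of the lab)
demo_dir=$(mktemp -d)

# Create a two-line script
printf '#!/bin/bash\necho "Hello from noexec.sh"\n' > "$demo_dir/noexec.sh"

# Read and write for the owner only: no execute permission for anyone
chmod 600 "$demo_dir/noexec.sh"

# bash only needs to read the file in order to interpret it
bash "$demo_dir/noexec.sh"
```

Trying to run that file directly by its path would fail with a "Permission denied" error until you add execute permission.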

Shebang line

Because bash scripts are interpreted, and extensions are mostly ignored in Linux: the shell you're using to execute your script needs to know what kind of script it is. There are many interpreted programming languages. If you don't make it clear what language your script is written in: there's a chance it will be misinterpreted.

A shebang line for a bash script looks like this:

#!/bin/bash

It has to be the first line in your script.

Anything following this line is regular bash.

hello.sh

Decide for yourself whether you can handle the bash learning in this lab while using vi. If you feel that's too hard: you can use a graphical text editor.

  • Create the lab9 directory inside your home directory.
  • Open a text editor and save an empty file into ~/lab9/hello.sh
  • Add the shebang line at the top
  • Look at the permissions for your file in a terminal. You'll find that by default you have read and write permissions, but not execute permissions. Give yourself execute permissions using one of these two commands
    chmod a+x hello.sh # Add execute permissions for everyone, or:
    chmod 755 hello.sh # Set permissions to exactly this
    
  • Run the script by specifying an absolute path, a relative-to-home path, or if your PWD is right: you can use the . special character as a shortcut:
    /home/yourusername/lab9/hello.sh # Run hello.sh using an absolute path
    ~/lab9/hello.sh # Run hello.sh using a relative-to-home path
    ./hello.sh # Run hello.sh from the PWD
    

Your script doesn't do anything yet, but if you get any errors: the script might be in the wrong place, it might have the wrong permissions, or you might not be executing it correctly.

echo

echo is an interesting command. Initially it may appear to do nothing of use, but with better understanding of how programming works it will make a lot more sense.

The echo command prints some output (via STDOUT) - whatever output you tell it to print.

For example: if you run this in a terminal:

echo Hello

it will print the word Hello. If you give echo multiple arguments: it will print them all, with one space in between each of them:

echo Hello    there # Note the multiple spaces

Remember that in bash the arguments for a command are whitespace-separated. So it doesn't matter how many spaces you put between arguments to echo: they are still interpreted as separate arguments. If you want to include multiple words and all the spacing between them in echo's output: combine them all into a single argument by enclosing the entire string of text into quotes:

echo "Hello    there" # Note the multiple spaces

The echo command can be used for more complicated things, but this is all we need for this lab.

  • Add a line to your shell script so that when you run the script: it will print: "Hello. I will now do a bunch of stuff". It should look like this when you run it:
Hello.sh-1.png

One of the main reasons shell scripts are exceptionally useful is that once you get your script to work: you don't need to worry about typos, command syntax, or even remembering how exactly the commands work.

The other big reason is: you don't have to retype your commands every time you want to run them.

This simple echo program is a great example of that. You can already see that typing the command to run the script is much shorter than the one echo line inside the script. Obviously the longer the script: the greater the probability you will make mistakes, and the more you'd need to type when you wanted those commands executed.

date

Another very simple command is date.

  • Run date in a terminal. It will print the current date and time.

Using arguments you can change the format of the output, add or remove parts that you want or don't want to see. We won't need to do that but if you're curious: you can look at the man page for date, under the FORMAT section.
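If you're curious, here is a small sketch of what a format argument looks like. The particular format strings are just examples picked for this illustration; the man page lists many more.

```bash
#!/bin/bash

# Default output: full date and time
date

# Year-month-day only
date +%Y-%m-%d

# Hours and minutes only
date +%H:%M

# Quote the format if it contains spaces, so it stays a single argument
date "+%A, %H:%M"
```

The format is a single argument that starts with a plus sign. That is why the last one needs quotes, for the same reason as the quoted echo example earlier.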

  • Modify your script so that its output looks like this (except with the current date, don't hard-code the 25th of march in your script):
Hello.sh-2.png

Re-running a script

A script is usually meant to execute unattended. Ideally your script will account for the most common variability in the environment.

For example: if you mean to create a directory in a script, and you run the script a second time: the script should not cause errors the second time because the directory was already created the first time. You should decide when you build the script how to deal with this possibility.

As an example: let's start with a script like this:

#!/bin/bash

mkdir ~/lab9/temp/
  • Save that code as ~/lab9/lab9.sh and give it execute permissions.
  • Run lab9.sh twice.
  • Note that the first time you run it: it creates the temp directory. The second time you run it: you get an error.

Depending on the circumstances you might want to ignore the error, or make sure the directory is deleted before you attempt to create it, or you might actually want to get an error.

  • In our case let's say we want a brand new, empty, temp directory created every time the script runs. To ensure that happens: delete the directory before you create it:
    #!/bin/bash
    
    rm -r ~/lab9/temp/
    mkdir ~/lab9/temp/
    
  • Now you can run the script as many times as you want, and every time you will end up with an empty temp directory.

This script still has some potential problems, but all of them are dealt with in the same way:

  1. Anticipate what could go wrong
  2. Structure your code to do something you want under different circumstances
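One possible sketch of those two steps, under some assumptions: the base variable is a throwaway stand-in for ~/lab9 so the example doesn't touch your real files, and the keep directory is made up. It shows two different decisions you could make about a directory that may already exist.

```bash
#!/bin/bash

# Throwaway stand-in for ~/lab9 (hypothetical, so nothing real is deleted)
base=$(mktemp -d)

# Decision 1: always start with a brand new, empty directory.
# -f tells rm not to complain if temp doesn't exist yet.
rm -r -f "$base/temp"
mkdir "$base/temp"

# Pretend an earlier run left a file behind, then do the same steps again
touch "$base/temp/leftover"
rm -r -f "$base/temp"
mkdir "$base/temp"
ls -A "$base/temp"      # prints nothing: the directory is empty again

# Decision 2: keep the directory and its contents if it is already there.
# -p tells mkdir not to complain if the directory exists.
mkdir -p "$base/keep"
touch "$base/keep/important"
mkdir -p "$base/keep"
ls -A "$base/keep"      # still lists the file created above
```

Neither choice is better in general. It depends on whether the old contents matter to you.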

lab9-dolab3.sh

We're going to write a script which will do all the command-line work you've done in lab 3, starting with the mkdir section.

  • Create ~/lab9/lab9-dolab3.sh
  • Your script has a PWD, no different than the interactive shell you've been using this whole time. If you want to run mkdir lab3-cmd and have that create lab3-cmd in your home directory: you need to change your PWD first:
    #!/bin/bash
    
    echo "My pwd is now: "
    pwd
    
    cd ~
    echo "My pwd is now: "
    pwd
    
  • Note that even though inside the script the pwd was your home directory: once the script exits your PWD is set back to what it was before the script executed. That's because a new bash process is started when you run your script, and that new process exits when the script is done executing:
PwdInScript.png
  • Have a look at what's inside your ~/lab3-cmd now. If you finished lab3: you should have some subdirectories and files in there. Maybe make a backup copy of it, so you have something to compare your script's work to.
  • We want this lab9-dolab3.sh script to do all the work, starting with creating lab3-cmd. So our script will delete lab3-cmd and all its contents before it does anything else. Try this:
    #!/bin/bash
    
    echo "My pwd is now: "
    pwd
    
    cd ~
    echo "My pwd is now: "
    pwd
    
    rm -r lab3-cmd
    
  • Run the script twice. Note that the second time you get an error, because you are trying to delete a directory which is no longer there.
  • You can fix that by adding the -f option to rm:
    #!/bin/bash
    
    echo "My pwd is now: "
    pwd
    
    cd ~
    echo "My pwd is now: "
    pwd
    
    rm -r -f lab3-cmd
    
  • You can use a terminal or a graphical file manager to check on the results of running your script.
  • The next few instructions we can copy verbatim from lab 3:
    #!/bin/bash
    
    # Start in home directory
    cd ~
    
    # Delete the lab3-cmd directory and its contents if it exists
    rm -r -f lab3-cmd
    
    # This way you make sure lab3-cmd exists and is empty
    mkdir lab3-cmd
    
    # These commands are copied verbatim from lab3:
    mkdir lab3-cmd/red
    cd lab3-cmd/red
    pwd
    
    mkdir ../green
    pwd
    ls
    ls ..
    
    # Make sure to replace youruserid with your user id:
    mkdir /home/youruserid/lab3-cmd/blue
    ls ~/lab3-cmd
    
  • Note that when the script gets longer: it's helpful (for you) to have some comments in it explaining what you're intending the script to do.
  • Also: you may get confused about which part of your script produces which output. In the latest version of the script it's not clear which output comes from ls and which from ls ..
  • A common way to deal with this is to add some echos:
    #!/bin/bash
    
    # Start in home directory
    cd ~
    
    # Delete the lab3-cmd directory and its contents if it exists
    rm -r -f lab3-cmd
    
    # This way you make sure lab3-cmd exists and is empty
    mkdir lab3-cmd
    
    # These commands are copied verbatim from lab3:
    mkdir lab3-cmd/red
    cd lab3-cmd/red
    pwd
    
    mkdir ../green
    pwd
    echo "Contents of the directory above (red):"
    ls
    echo "Contents of the parent of that directory (~/lab3-cmd/):"
    ls ..
    
    # Make sure to replace youruserid with your user id:
    mkdir /home/youruserid/lab3-cmd/blue
    echo "Contents of ~/lab3-cmd/:"
    ls ~/lab3-cmd
    
  • Let's skip the removing directories section of lab 3 since that doesn't leave anything for us to look at. We'll continue with the cp section.
  • Add the first two commands from that section to your script and run it:
    cd ~
    cp Downloads/SampleFiles.tar.xz lab3-cmd/red
    
  • Then check that your script did what it was supposed to. There are multiple ways to do that:
    1. Add the -v argument to cp. That way it will print what it's copying into where.
    2. Add a temporary ls command inside your script, to check that the new file has been copied to lab3-cmd/red successfully.
    3. Look in that directory outside your script.

When you write a long script: it's good practice to check that smaller parts of it work, rather than finish the whole script and then try to figure out at which point it didn't do what it was supposed to.

  • Add the rest of the commands from the cp section to your script: so you'll end up with the same contents as you did at the end of that section in lab 3 (there's a screenshot there).
  • Finally: add commands to your script to also complete the moving section and deleting section of lab 3.

Don't forget to run your script frequently to check that your latest modifications worked. Don't ignore errors: fix them before you move on.
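Here is a rough sketch of checking one small step before moving on, combining the -v option and a temporary ls as suggested above. The demo directory and sample.txt are invented for this illustration; in your own script you would use the lab 3 paths instead.

```bash
#!/bin/bash

# Throwaway stand-in for ~/lab3-cmd (hypothetical)
demo=$(mktemp -d)
mkdir "$demo/red"
echo "some text" > "$demo/sample.txt"

# -v makes cp print what it copies and where it puts it
cp -v "$demo/sample.txt" "$demo/red"

# Temporary check: delete these two lines once you trust this step
echo "Contents of red after the copy:"
ls "$demo/red"
```

Once the output shows the file where you expect it, remove the temporary lines and add the next few commands.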

What was the point?

You might be thinking: writing this script was not any easier than doing lab3 one command at a time. So what was the point? There are two:

  1. You needed an introduction to scripting, and with this one you were already familiar with the expected results.
  2. You don't write a script to execute it once. Imagine you had multiple machines, a hundred perhaps, and they all needed lab3 completed on them. Doing it one command at a time would take 100 times longer. Running the script 100 times would barely take any time at all.

time

There's a command in Linux called time. It doesn't give you the current time (the date command does that). The time command measures how long another command takes to run. For example, you can measure how long your new script takes to run:

time ./lab9-dolab3.sh

You can ignore the user/sys times. The "real" time is time as we humans experience it.
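To see the difference between the three numbers, you can try timing a command that does nothing except wait. This is only a sketch, and sleep is used here purely as a convenient example of a command with a known duration.

```bash
#!/bin/bash

# sleep 1 waits for one second and does no real work.
# time prints its report after the command finishes.
time sleep 1
```

The real time should come out slightly above one second, while user and sys stay close to zero, because the computer was waiting rather than working. The exact numbers will differ from machine to machine.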

Submit evidence of your work

After you finish the lab: run the following commands to submit your work:

cd ~
wget http://ops345.ca/check/ops145-lab9-check.sh # Download the check script
chmod 700 ops145-lab9-check.sh # Make the downloaded file executable
./ops145-lab9-check.sh # Run the check script

If it says "Your lab9 has been submitted": make a screenshot, and you're done. If it gives you any warnings or errors: you have to fix them and try the ./ops145-lab9-check.sh command again.