OPS145 Lab 6 Newversion: Difference between revisions

From Littlesvr Wiki
Revision as of 09:48, 28 February 2024

Special characters in the shell

The terminal you've been using this whole time in the course is running an application called the shell, specifically the bash shell.

This application (bash) has been written to interpret input from the user as some sort of a command; or as an argument; or as data. It's a very powerful application with many abilities, but almost all the user input comes from the keyboard.

There are only so many keys on the keyboard, and most of them are used to input human-readable characters. That means some of those characters will necessarily have multiple meanings. We will look at several such characters in this lab.

Here's most of the ASCII table (as much as I could fit on my screen). Note that most of the characters in this table are familiar to you, but some (especially the first 32) are probably brand new to you. You can see this table easily by running man ascii:

LongerManAscii.png

. and ..

Coming from the Windows world you might be used to the concepts of "filename" and "extension". If you're not too familiar with that: you should configure Windows to always show file extensions (it doesn't by default).

For example if you have a picture file called smily.jpg: you might call smily the filename, and jpg the extension. But then what is the dot? You could call it a separator, but in fact the dot is just as much a part of the filename as smily and jpg. Looking at the ASCII table: s and m are just as different from each other as s and .

And what's the extension in a file called lab1.tar.xz? Is it xz? Or tar.xz? What about in a file called Dr. Evil plan.txt?

When it comes to filenames and extensions: the use of a dot as a separator is just a convention to help people organize their files, and is barely useful from a technological point of view.

But there's another use of dots in filesystems, which is much more fundamental. Every directory contains at least these two records: a link to itself (.), and a link to its parent (..).

You can see these records when you run ls with the -a argument. Let's look at the SampleFiles directory you downloaded back in Lab 2:

  • Open a terminal, and change the PWD to ~/Downloads/SampleFiles
  • Run ls -a and note that the output includes a . and a .. entry.
  • Run ls -a -l and note that . and .. are directories.
Ls-al.png
  • Remember that you can give an argument to ls telling it to show the details of a specific file or directory. For example ls -l 1984.txt will show details about that specific file, and ls -l / will show details for the immediate contents of the root directory.
  • Run ls -l -a . (including the dot at the end) and note that it shows the same output as ls -l -a without the dot. That's because:
    • ls will show the contents of PWD if you don't give it a specific path.
    • Your PWD is ~/Downloads/SampleFiles/
    • The . in SampleFiles points to itself (SampleFiles)
  • Run ls -l -a .. and ls -l -a ~/Downloads and note that the output is the same. That's because:
    • Your PWD is ~/Downloads/SampleFiles/
    • The parent directory of ~/Downloads/SampleFiles/ is ~/Downloads/
    • Inside ~/Downloads/SampleFiles/, .. and ~/Downloads are the same thing
  • The same is true for every other directory. The only sort-of exception is the root directory. Because it's the root and it doesn't have a parent: its parent is itself:
FilesystemIntroWithDots.png
  • Take some time to look at these many lines and understand that actually it's not that complicated. You just have to remember: dot is for itself, dot-dot is for its parent.
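
If you'd like to check the "dot is itself, dot-dot is the parent" rule without touching your real files, here is a small sketch. It assumes a throwaway directory made with mktemp; the parent and child directory names are made up for this example.

```shell
# Work in a throwaway directory (hypothetical names, not part of the lab files)
scratch=$(mktemp -d)
mkdir -p "$scratch/parent/child"
cd "$scratch/parent/child"

here=$(pwd)                          # where we are now
dot=$(cd . && pwd)                   # . points back at the same directory
dotdot=$(cd .. && pwd)               # .. points at the parent directory
root_parent=$(cd / && cd .. && pwd)  # the root directory is its own parent

echo "PWD:            $here"
echo "PWD after cd .: $dot"
echo "PWD after cd ..: $dotdot"
echo "Parent of /:    $root_parent"
```

The second line of output should match the first, the third should be one level up, and the last should still be /.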

Hidden files

Dots have another (mostly unrelated) special use when it comes to filesystems in the shell. Any file/directory name which begins with a dot is considered hidden.

This is definitely not a security feature! It is purely a convenience.

  • Change your PWD to your home directory.
  • Run ls
  • Run ls -a

Notice there are many more files and directories in your home than you saw before. These were always there; the ls command just doesn't show them by default. The -a argument tells ls to also show hidden files.

These are hidden to avoid clutter, to make it easier for you to find things you want. There's nothing special about these files and directories. They're just rarely manipulated by users.

This is also where the "extension" idea breaks down completely. If you insist that a file's extension is everything after the dot and the file's name is everything before the dot: all these hidden files have no names, and only extensions. That's clearly not the case. Almost all these files are plain text files, without a .txt extension.

  • Use the cat command to look at the contents of .bash_history. This file is where the commands you ran previously are stored, so you can retrieve them using the arrow keys.
  • Since you're at it: here's a new command. Run history on the terminal. It will show you all the previous commands you've run.
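
To see that a hidden file is just a file whose name starts with a dot, you can make one yourself. This is a minimal sketch in a throwaway directory; the names visible.txt and .hidden_notes are invented for the example.

```shell
# Throwaway directory so your home directory is left alone
scratch=$(mktemp -d)
cd "$scratch"

touch visible.txt .hidden_notes   # the second name begins with a dot

plain=$(ls)      # hidden names are left out
all=$(ls -a)     # hidden names are included, along with . and ..

echo "ls shows:"
echo "$plain"
echo "ls -a shows:"
echo "$all"
```

Nothing was done to .hidden_notes to hide it. The leading dot in the name is all it takes.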

Comments

  • Now that you're more comfortable with the command-line: download and extract the next tarball using commands instead of Firefox and a graphical archive manager:
    cd ~/Downloads # To make sure it's downloaded here
    ls -l -h # To confirm you didn't already download SpecialChars.tar.xz
    wget http://ops345.ca/ops145/SpecialChars.tar.xz # This will download SpecialChars.tar.xz into the PWD
    tar xvf SpecialChars.tar.xz # This will extract the contents of SpecialChars.tar.xz into the PWD
    

In the block of commands above there are words which are clearly not intended to be executed by the shell (it's only so smart). They are for human beings to read. But you can run all the above without any difficulty because anything following the special character # is treated as a comment, meaning the shell ignores it.

The # is usually called a pound sign, or a number sign, or a hash.
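
Here is a quick sketch of what the shell does with a comment, using echo so nothing is downloaded or extracted. The function name demo is made up for this example. The second line is included only for contrast: inside quotes the # loses its special meaning.

```shell
# demo is a hypothetical helper so both lines can be run together
demo() {
    echo one two # three four
    echo "one two # three four"
}

demo
```

The first echo receives only the two words before the #. The second receives the entire quoted text, hash included.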

The * wildcard

You've used the * wildcard (called a star, it's the multiplication sign on your keyboard) in lab 3 to delete several files at the same time. Now we'll look at it in more detail.

Sometimes you want to do something with all the contents of a directory. At other times you might want to do something with all your .jpg files, or all the .txt files, or all the files which have backup in their name. Those are examples of when you use a *

Inside the newly created ~/Downloads/SpecialChars/ directory is a SortMe subdirectory, which is a disorganized mess. We'll organize it in this lab. Remember that the whole point is to learn wildcards, so use commands in a terminal to move files around instead of dragging them in the graphical interface.

  • Use the graphical file manager to look at the SortMe directory to get an idea of what's in there. Note that:
    • There are pictures and sounds
    • Some of the pictures are small squares, and others are large photos
    • Some of the sounds are .wav files and others are .oga files
    • Some of the .wav files sound like speaker test sounds
  • We're going to organize these into the following directory tree inside ~/lab6:
SortMeTree.png
  • Remember that filenames are case-sensitive. Open a terminal and create the directory tree above using as many commands as you like. But here's a handy argument you can give to the mkdir command: -p
    mkdir -p ~/lab6/sounds/WAV/SpeakerTest
    
    When run with -p, the mkdir command will create not only the last directory on the path you give it, but all the other missing directories in that path as well. You might want to avoid using this if you're not very comfortable with the command-line yet, since it's very easy to create a hundred directories in the wrong place by giving mkdir -p the wrong path several times.
  • Change your PWD to ~/Downloads/SpecialChars/SortMe
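
If you want to see what mkdir -p changes before using it on your real ~/lab6, this sketch tries the same path with and without -p. It assumes a throwaway directory from mktemp, so the lab6 tree it makes is only a practice copy.

```shell
# Practice copy of the tree inside a throwaway directory
scratch=$(mktemp -d)

# Without -p this fails, because lab6, sounds and WAV don't exist yet
mkdir "$scratch/lab6/sounds/WAV/SpeakerTest" 2>/dev/null \
    || echo "plain mkdir failed: the parent directories are missing"

# With -p every missing directory on the path is created
mkdir -p "$scratch/lab6/sounds/WAV/SpeakerTest"

# Show every directory that now exists under the practice lab6
find "$scratch/lab6" -type d
```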

When the bash shell encounters a * character on the command-line: it assumes you're trying to use it as a wildcard for filenames, and does its best to replace the * with a list of all filenames which match the pattern.

On its own: a * matches any number of any characters. We can test this easily using the ls command:

ls *

This appears to print the same output as ls on its own, but in fact the ls command received from the shell a list of all the files in ~/Downloads/SpecialChars/SortMe as arguments. You can combine the * wildcard with other characters, restricting what it will be expanded to. For example:

ls *jpg

This time * still expands to any number of any characters, but because it's combined with an explicit jpg at the end: the result will be a list of all the files which have jpg at the end of their name.

  • Try running commands similar to the above with different extensions.

The wildcard doesn't have to be at the beginning. It can be at the end. For example this is how you would get a list of all the files which have names starting with amanea:

ls amanea*

The wildcard can be in the middle too. For example this is how you would get a list of all the files with _ somewhere in the name:

ls *_*

Or a list of filenames which contain an _ and end with wav:

ls *_*wav
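
One way to convince yourself that the shell (and not the command) does the expanding is to hand the wildcard to something that only reports what it received. This is a sketch with made-up filenames in a throwaway directory; count_args is a hypothetical helper which prints how many arguments it was given.

```shell
# Throwaway directory with a few invented filenames
scratch=$(mktemp -d)
cd "$scratch"
touch amanea_1.jpg amanea_2.JPG beach.jpg Front_Left.wav chime.oga

# Hypothetical helper: prints the number of arguments it received
count_args() { echo "$#"; }

jpg_list=$(echo *jpg)          # names ending in lowercase jpg
amanea_list=$(echo amanea*)    # names starting with amanea
total=$(count_args *)          # one argument per file in the directory

echo "*jpg became:    $jpg_list"
echo "amanea* became: $amanea_list"
echo "* became $total separate arguments"
```

Neither echo nor count_args knows anything about wildcards. By the time they run, the * is already gone and a list of filenames is in its place.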

The use of wildcards is completely unrelated to the ls command; ls just happens to be an easy program to show what the wildcard expands to. You can use it with equal effectiveness with other commands, like cp. If we didn't have a graphical interface available and we wanted to copy all the jpg files to the ~/lab6/pictures directory you created earlier: that would involve a lot of typing, even with tab completion. Instead we can copy all of them quickly.

  • First run ls and ls *jpg to confirm that *jpg expands to what you want (in this case all the jpg files). You should notice that it doesn't. There are two jpg files which have their JPG in uppercase.
  • If you want to include those as well: use two arguments (bash will actually convert your two arguments to 26 arguments - all the jpg and JPG filenames)
    ls *jpg *JPG
    
  • If you're satisfied with the list of filenames you're getting: you can use the same list with the cp command. The cp command is potentially destructive, since it will overwrite files by default. So think twice before you run it with unknown arguments.
  • Also give cp the extra -v (verbose) argument so that it will print details about what it's doing:
    cp -v *jpg *JPG ~/lab6/pictures/
    
  • Run a command to copy all the wav files to ~/lab6/sounds/WAV/
  • Run a command to copy all the oga files to ~/lab6/sounds/OGA
  • Change your working directory to ~/lab6/sounds/WAV
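
The "check with ls first, then copy" routine from the steps above can be practised safely. This is a sketch only: it uses a throwaway directory and invented filenames instead of your SortMe files.

```shell
# Practice source and destination directories (hypothetical names)
scratch=$(mktemp -d)
mkdir "$scratch/src" "$scratch/pictures"
cd "$scratch/src"
touch one.jpg two.jpg three.JPG notes.txt

# Step 1: preview what the patterns expand to
ls *jpg *JPG

# Step 2: reuse exactly the same patterns with cp
cp -v *jpg *JPG "$scratch/pictures/"

# Count what arrived; notes.txt should have been left behind
copied=$(ls "$scratch/pictures" | wc -l)
echo "Files copied: $copied"
```

Because filenames are case-sensitive, *jpg alone would have missed three.JPG.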

We want to move the files Front_Right.wav, Rear_Center.wav, Side_Left.wav, Front_Center.wav, Rear_Left.wav, Side_Right.wav, Front_Left.wav, and Rear_Right.wav to the SpeakerTest subdirectory without typing in all those names. Luckily (would you believe it) these specific files have a specific filename pattern which makes them different from all the other files in sounds/: they all contain an underscore.

  • Use ls to check that the *_* pattern will include all the speaker test files, and only the speaker test files:
    ls *_*
    
  • If it looks good: use that argument with the mv command. Note that mv also accepts a -v (verbose) argument:
    mv -v *_* SpeakerTest
    

You should end up with the following arrangement in your sounds directory:

TreeOfSounds.png

The ? wildcard

Like the * wildcard: the ? wildcard stands for any character. But unlike the *: the ? stands for exactly one character.

To illustrate we're going to move all the small square shaped images into a new subdirectory.

  • Create the ~/lab6/pictures/square directory
  • In ~/lab6/pictures: try ls *_*
  • Note that it shows all the files. That's because all of them contain an underscore. But we want only the filenames which begin with a single digit, then have an underscore, and then anything else. Use a ? to get that list:
    ls ?_*
    
    You could make it more specific by adding jpg in the end, but in this case it's not necessary.
  • If the output looks right: move all the square images into the square directory:
    mv ?_* square
    

You should end up with this arrangement:

TreeOfImages.png
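
To compare ? and * side by side, here is a sketch with invented filenames in a throwaway directory. Note that 10_wide.jpg has two characters before the underscore. count_args is a hypothetical helper which prints how many arguments it was given.

```shell
# Throwaway directory with made-up picture names
scratch=$(mktemp -d)
cd "$scratch"
touch 1_red.jpg 2_blue.jpg 10_wide.jpg photo_big.jpg

# Hypothetical helper: prints the number of arguments it received
count_args() { echo "$#"; }

one_char=$(echo ?_*)            # exactly one character, then an underscore
any_count=$(count_args *_*)     # any number of characters, then an underscore

echo "?_* became: $one_char"
echo "*_* matched $any_count files"
```

The ? pattern skips 10_wide.jpg and photo_big.jpg because they have more than one character before the underscore.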

Spaces in filenames

At this point in computer history people are used to the idea that a space is nothing special in a filename. That's because most people use GUIs to manipulate files.

On the command-line a space is a very special character which is used all the time: it's the separator between the command and the arguments, and the separator between multiple arguments.

From the filesystem's point of view: there is nothing special about a space, it's just another valid character from the ASCII table. The problem is the interface between the user and the filesystem, in our case that's the shell.

  • Inside your downloaded and extracted SpecialChars directory is a Spaces directory. Copy that to ~/lab6/
  • If you look at the contents in a graphical interface: you'll see there really isn't anything special about those files and directories.
  • In your terminal: change your PWD to ~/lab6/Spaces/
  • Run ls. Note that the names have single quotes around them. This is relatively new behaviour of the ls command itself (not of bash), and many systems won't add the quotes. Note also that the spaces between the filenames in the output are no different than the spaces inside the filenames.
  • Try to see the contents of Important Dir using ls
    ls Important Dir
    

It doesn't work because the shell thinks you mean that Important and Dir are separate filenames. To tell the shell you mean the space is a part of Important Dir: you need to escape the space. There are three ways to do this:

  • Put a \ (backslash) before the special character
  • Enclose the entire argument in double quotes
  • Enclose the entire argument in single quotes
  • Try again to see the contents of Important Dir using ls, but this time escape the space:
    ls Important\ Dir
    
    You may have already found that tab completion does this for you automatically.
  • If the shell encounters a quote (either double quote or single quote): it will assume everything from that quote until the next quote is one argument. For example, you can run:
    ls "Important Dir"
    ls 'Important Dir'
    
  • Both of those will work.
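
Here is a sketch showing that the three ways of escaping really do give the command the same single argument. It works in a throwaway directory; plan.txt is an invented file, and count_args is a hypothetical helper which prints how many arguments it was given.

```shell
# Throwaway directory containing a name with a space in it
scratch=$(mktemp -d)
cd "$scratch"
mkdir "Important Dir"
touch "Important Dir/plan.txt"

# Hypothetical helper: prints the number of arguments it received
count_args() { echo "$#"; }

unescaped=$(count_args Important Dir)     # the space splits this into two arguments
escaped=$(count_args Important\ Dir)      # one argument

with_backslash=$(ls Important\ Dir)
with_double=$(ls "Important Dir")
with_single=$(ls 'Important Dir')

echo "Unescaped: $unescaped arguments, escaped: $escaped argument"
echo "Backslash: $with_backslash"
echo "Double quotes: $with_double"
echo "Single quotes: $with_single"
```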

Spaces in filenames are not impossible to deal with on the Linux command line, but they make simple things difficult, and complicated things really difficult.

Escaping other special characters

Like spaces: other special characters can have their special status stripped by escaping them.

  • Inside your downloaded and extracted SpecialChars directory is a WeirdNames directory. Copy that to ~/lab6/

Normally you would avoid creating such filenames if you're expecting to work with them in the command line, but this is a learning exercise.

  • Change your PWD to ~/lab6/WeirdNames
  • Try to run an ls command escaping only the space in Andrew's stuff.txt:
EscapeQuote.png

The shell will give you this strange prompt-like behaviour, as if it's waiting for you to type something in. What it's waiting for is the closing single quote for what looked like an opening single quote.

  • Press Ctrl+c to abort the command you were trying to run.
  • Re-run the ls command, this time escaping not only the space but the single quote as well.
  • Similar problems occur with the * special character. A * is just a character like any other in the ASCII table. It has special meaning in the shell but not on the filesystem. Try it:
EscapeStar.png
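
If you'd like to experiment with these escapes away from the lab files, here is a sketch in a throwaway directory. Andrew's stuff.txt is the name from this lab, but star*file.txt and starfile.txt are invented for the example, and count_args is a hypothetical helper which prints how many arguments it was given.

```shell
# Throwaway directory with awkward names (quoted here so they can be created)
scratch=$(mktemp -d)
cd "$scratch"
touch "Andrew's stuff.txt" 'star*file.txt' starfile.txt

# Hypothetical helper: prints the number of arguments it received
count_args() { echo "$#"; }

# Escape both the single quote and the space with backslashes
quote_escaped=$(ls Andrew\'s\ stuff.txt)

# Or put the whole name in double quotes, which also covers the single quote
quote_double=$(ls "Andrew's stuff.txt")

# Escaped: the * is an ordinary character and only one file matches
star_literal=$(ls star\*file.txt)

# Unescaped: the * is a wildcard again and both star files match
star_wild=$(count_args star*file.txt)

echo "$quote_escaped"
echo "$quote_double"
echo "$star_literal"
echo "Unescaped star matched $star_wild files"
```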

Submit evidence of your work

After you finish the lab: run the following commands to submit your work:

cd ~
wget http://ops345.ca/check/ops145-lab6-check.sh # Download the check script
chmod 700 ops145-lab6-check.sh # Make the downloaded file executable
./ops145-lab6-check.sh # Run the check script

If it says "Your lab6 has been submitted": take a screenshot, and you're done. If it gives you any warnings or errors: you have to fix them and run the ./ops145-lab6-check.sh command again.