Tutorial 9

Submitting

Regardless of how you choose to complete the assignment, you MUST submit the file you worked on to Canvas.

If you’re submitting remotely, your submitted file will be graded for completeness and correctness via an autograder. Make sure your functions are named correctly and that all of the built-in check-expects pass.

If you’re submitting in-class, make sure to check in (there will be one question called CHECK-IN Q, available only for the first 7 minutes of class) and check out (there will be a three-question survey called CHECKOUT SURVEY, available only for the last 10 minutes of class) using PollEverywhere with your location verified.

Getting Started

In this tutorial, you’ll learn to write programs that explore the files and directories (folders) of your computer’s file system. Specifically, you’ll be writing:

- functions to explore or search folders for files of a certain name, etc., and
- functions to back up folders by copying folders and files from one location to another

Note: This exercise needs the file_operations.rkt library, which is already required at the beginning of the starter code.

So our goals are:

- Understand files, folders, and path names
- Traverse the file system of our computers
- Understand why manipulating files and folders is a type of imperative programming
- Practice writing functions that manipulate files

Tutorial 9 Starter Files

Intro: File System Terminology

There are three concepts from file systems we are using here: files, folders, and paths. File systems are structured as trees, so this should feel familiar from past assignments. If you feel comfortable with file system data structures, feel free to skip this section.

- files have names and contain data, like the .rkt file you’re currently working in. (These are like leaf nodes on a tree.)
- folders, also known as directories, have names and contain two types of data:
  - files
  - other folders
  Since they can contain other folders, they’re a recursive structure and form a tree.
- paths represent where to find files in a file system. They consist of a sequence of folder names, and may or may not end in a file name. For example:
  - /test/test_2/bar.txt says there is a folder test, which contains a folder called test_2, which contains a file called bar.txt, and we are referring to this file.
  - /test/test_3/ says there is a folder test, which contains a folder called test_3, and we are referring to this folder.

Paths in ASL

In the pre-recorded lecture (and most of the time in real life), we see paths written as strings, with folder names separated by a /. In ASL, paths are their own data type. They are NOT strings for good reasons as you’ll see below.

To create a new path, use build-path:

; build-path : string ... -> Path
(build-path "test" "test_2" "bar.txt") ; test/test_2/bar.txt
(build-path "test" "test_3") ; test/test_3

You can also convert between paths and strings using the functions string->path and path->string:

(define p (build-path "test" "test_3"))
(path->string p) ; "test/test_3"
(string->path "test/test_2/bar.txt") ; #<path:test/test_2/bar.txt>

Note: On macOS and Unix, paths look like /folder1/folder2/folder3/folder4, while Windows paths look like c:\folder1\folder2\folder3. However, in programming, the Windows-style backslash \ is usually replaced with the Unix-style forward slash /. So, c:/folder1/folder2/folder3 is a valid Windows path in Racket.

Paths can be either absolute or relative. Absolute paths start with / on macOS/Linux and C:\ on Windows, which denote the top-level directory of your hard drive. Relative paths do not start at the top of the file system. Instead, they start at what we call the “current” directory. So, a path test-folder/file.txt (note no / or C:\ at the beginning) says “file.txt, which is inside of test-folder, which is inside of the directory where I currently am”. In DrRacket, the current directory is typically the directory where the currently open file is saved.
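To make the relative/absolute distinction concrete, here is a small sketch in plain Racket (note the `#lang racket` line; it is not ASL). It assumes the standard Racket functions `relative-path?`, `complete-path?`, and `path->complete-path`, which may not be available in the ASL environment used for this tutorial. The name `p` is just for illustration.

```racket
#lang racket
;; A relative path: no leading / or drive letter.
(define p (build-path "test-folder" "file.txt"))

;; relative-path? only inspects the path itself; the file need not exist.
(relative-path? p)

;; path->complete-path resolves p against the current directory,
;; producing a path that starts at the top of the file system.
(define full (path->complete-path p))
(complete-path? full)
```

Both queries should evaluate to true; the exact contents of `full` depend on where the program is run, so no output is shown for it.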

!CAREFUL! - Imperatives Mean Business

We’ve modified the file operations that change the disk, like delete-file!, to only work on files and directories that are in the same directory as your assignment file or in its subdirectories. This is to prevent you from accidentally deleting files elsewhere on your computer. This means that they only work on relative paths. In this assignment, you will use the sample filesystem we have created for you to test with. But be careful...our safeguards for delete-file! do not prevent you from accidentally deleting your TUTORIAL FILE. BE CAREFUL NOT TO DO THIS! Make sure that whenever you close DrRacket, your tutorial file still exists. #imperativeprogramming

Because paths are structs, we can have specialized accessors that give us convenient access to parts of a Path. One example is the path-filename function:

; path-filename : Path -> Path
; Returns just the filename part of a path.

Part 1. Backing up files

Before writing anything for yourself, your first goal should be to understand the backup! function included in the template file.

You can run the function like so:

(backup! (string->path "test") (string->path "output"))

This will take all the files in the test subdirectory of the tutorial template folder and copy them into another subdirectory called output.

[Screenshot: initial directory structure]

Note: this assumes you have opened your RKT file and run it at least once (that's where the compiled folder and the ~ file come from!).

[Screenshot: provided test directories]

You should see the following in the interactions window while running it:

Copying file test\foo.bmp to output\foo.bmp
Copying file test\test.txt to output\test.txt
Copying file test\test_2\bar.txt to output\test_2\bar.txt
Copying file test\test_2\foo.bmp to output\test_2\foo.bmp

File Operations Functions Cheatsheet

; copy-file! : Path Path [boolean] -> void
; Effect: copies the data specified at the 1st Path to the 
;   location in the 2nd path. If the 3rd arg is included
;   and set to #true, any existing file at the 2nd Path
;   will first be deleted before the copy is completed.

; file-exists? : Path -> boolean
; Returns #true if a file exists at the given Path

; delete-file! : Path -> void
; Effect: deletes the specified file

; file-or-directory-modify-seconds : Path -> number
; Returns the time the file was last modified. For historical
;  reasons this is the number of seconds since January 1, 1970.

; rename-file-or-directory! : Path Path -> void
; Effect: changes the name or the location (directory) of the
;   given file (1st Path) to the 2nd Path.

Directory Operations Functions Cheatsheet

; make-directory! : Path -> void
; Effect: creates a new directory with the specified Path
;  as long as the parent directory already exists!

; delete-directory! : Path [boolean] -> void
; Effect: deletes a directory at the given Path.
;   If the second argument is included and #true, then
;   it will delete all files and subdirectories in there.

; directory-exists? : Path -> boolean
; Returns #true if there is a directory by that name

; directory-files : Path -> (listOf Path)
; takes a directory path, and returns a list of
; paths to the files contained in that directory.
> (directory-files (build-path "test"))
(list #<path:test/test.txt> #<path:test/foo.bmp>)

; directory-subdirectories : Path -> (listOf Path)
; Takes a directory path, and returns a list of paths
; to the sub-directories contained in that directory.
> (directory-subdirectories (build-path "test"))
(list #<path:test/test_2> #<path:test/test_3>)
> (directory-files (build-path "test" "test_2"))
(list #<path:test/test_2/bar.txt> #<path:test/test_2/foo.bmp>)

Activity 1.1. Skipping existing files

Right now, backup! will work just fine. However, if you run it a 2nd time, it will re-copy all of the files from the 1st directory to the 2nd directory regardless of whether or not they exist already in the destination!

Your task is to modify backup! so that it only copies a file if it does not already exist in the destination directory. The file-exists? function (see the files cheatsheet above) might be useful here!

Hints!

- First, identify the part of the existing program that is currently in charge of copying files.
- Because it's a sequence of imperatives (first print the file's name, then copy it), it's wrapped in a begin.
- You want to conditionally execute this imperative statement. Wouldn't it be convenient if we had a conditional specifically built for imperatives that allowed us to say "do this unless this is true"?
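As a hedged illustration of that kind of conditional, the sketch below uses Racket's `unless` on made-up data rather than real files. It is plain Racket (`#lang racket`), and `actions`, `record!`, and `copy-if-missing!` are hypothetical names invented for this example; nothing here touches the disk or comes from the starter code.

```racket
#lang racket
;; A hypothetical log of actions, newest first, so the effects are observable.
(define actions '())

;; record! : string -> void
;; Effect: adds a description of an action to the log.
(define (record! what)
  (set! actions (cons what actions)))

;; copy-if-missing! : string boolean -> void
;; Effect: pretends to announce and copy `name`, unless it is already present.
(define (copy-if-missing! name present?)
  (unless present?
    (begin
      (record! (string-append "announce " name))
      (record! (string-append "copy " name)))))

(copy-if-missing! "a.txt" #false) ; runs both steps
(copy-if-missing! "b.txt" #true)  ; does nothing
(reverse actions) ; (list "announce a.txt" "copy a.txt")
```

The `begin` groups the two imperatives so they run (or are skipped) together.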

Activity 1.2. Testing your new `backup!`

Testing with imperatives has been a mess in previous weeks, but now it's even worse. Automated tests with files are quite complex because other programs on our computers might also be affecting the state of our directories at the same time. Instead of writing check-expects, you can test your fixed function by:

1. First running (backup! (string->path "test") (string->path "output")).
2. To see that it doesn’t copy files unnecessarily, create a new file (either through ASL or through another application on your computer), such as new-file.txt, somewhere within the test directory. Rerun the command above, and verify that new-file.txt gets copied into output, but none of the other files are re-copied.
3. To verify it still copies files when it needs to, delete one of the files from output, and re-run the command. The deleted file should be re-copied.

Activity 1.3. Updating stale backups

Copy your fixed definition of backup! and make an identical function named backup-new!. Now modify this new function so that it also copies files that already exist in the output directory but have been modified in the original since they were last backed up.

In the previous question, we modified backup! to avoid making copies needlessly, but we made it too aggressive… now, if we make a backup and then change the original file, backup! wouldn’t copy the revised version into the output folder. We want to fix this in backup-new!.

Essentially you need to modify backup-new! to copy files when either:

- The file does not exist in the backup directory (you did this in 1.1).
- The file already exists in the backup directory, but the original file has been modified since the backup was created (i.e. the date modified of the original is larger than the date modified of the backup).

The file-or-directory-modify-seconds function will be very helpful here (check out the File Operations Functions cheatsheet above)!

Hints

- On Windows, copying a file gives it the same “Date Modified” as the original, so make sure you do not copy the file if the modification times are the same.
- If you are getting the following error:
  file-or-directory-modify-seconds: error getting file/directory time ...system error: No such file or directory; errno=2
  remember that order matters in imperative code: it’s important to check whether the file exists BEFORE you check when it was last modified. (You can’t ask the operating system for the date modified of a nonexistent file!) So you should structure your condition like this: (and (file-exists? ...) ...check last modification time...)
  You won’t have to worry about the error, because and evaluates its conditions in order and bails out as soon as one of them returns false.
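The short-circuiting behavior of `and` can be seen in isolation with a minimal sketch. It is written in plain Racket (`#lang racket`), uses the standard `file-exists?` and `file-or-directory-modify-seconds`, and introduces a hypothetical helper `modified-since?` that is not part of the starter code. The file name is made up and is assumed not to exist.

```racket
#lang racket
;; modified-since? : path number -> boolean
;; Hypothetical helper: #true when the file exists AND was modified
;; after the given time (in seconds since January 1, 1970).
(define (modified-since? p seconds)
  (and (file-exists? p)                                ; checked first
       (> (file-or-directory-modify-seconds p) seconds))) ; only reached if the file exists

;; The file below is assumed not to exist, so `and` stops at the first
;; condition and the timestamp is never requested: no error is raised.
(define missing (build-path "no-such-file-for-this-demo.txt"))
(modified-since? missing 0)
```

Swapping the two conditions would ask for the timestamp first and raise the "No such file or directory" error for a missing file.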

Activity 1.4. Testing `backup-new!`

To test this function:

- Check for regressions by repeating all the tests from the backup! question, to make sure no previously working code has been broken.
- Verify the new function re-copies files that have been modified: modify one of the files in the original test directory (for example, by editing test.txt), run backup-new!, and check that the file was copied over.

Part 2. Searching for Files

Activity 2.1. `count-files`

Write a function, count-files, that takes the pathname of a directory as input and returns the number of files within the directory and all its subdirectories (and their subdirectories, recursively).

; count-files : path -> number
; Takes a path to a directory, and returns the number
; of files within the directory and all its descendants,
; recursively.

So (count-files (string->path "test")) would return 4, assuming you hadn’t modified the test directory.

Off-by-one error?

Due to a hidden macOS system file called .DS_Store, you may see a result here that is one more than the number of visible files. Windows will sometimes also create a hidden file called Thumbs.db. So don’t panic if you see a count that looks one or two files too large. It’s probably the right answer; it’s just including files you can’t see. You can use directory-files to list the files in a given directory and check whether one of these hidden files appears. Alternatively, you can configure your operating system to display them in folders. To view hidden files on macOS, use the keyboard shortcut Cmd+Shift+. while in the Finder. To view hidden files in Windows, check the “Hidden items” box in the “View” tab of File Explorer.

Hint 1

Write a simple function called count-files to count the number of files in a directory itself (i.e. not the subdirectories).

Hint 2

Modify it to call itself recursively for each subdirectory. You can do this by using map to recursively call count-files on each path in the list of subdirectories!

Hint 3

Finally, use the provided sum function to combine the result of recursive calls. Remember (map count-files ...) will return a list of results!

Note: this is a little weird since it’s a recursion that doesn’t require you to write the base case and use an if to keep the function from recursing infinitely. If you call directory-subdirectories on a directory with no subdirectories, it will return the empty list and so map won’t attempt to recurse any further. Since map has an if that checks for the empty list, that’s enough to stop the recursion.
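To see why no explicit base case is needed, here is a sketch of the same recursion shape on a toy tree instead of the real file system. It is plain Racket (`#lang racket`); the `folder` struct, `count-toy-files`, and this definition of `sum` are stand-ins invented for the example (the tutorial's provided sum may be defined differently).

```racket
#lang racket
;; A toy stand-in for a directory: a name, a list of file names,
;; and a list of sub-folders.
(struct folder (name files subfolders))

;; sum : (listof number) -> number
;; Stand-in for the provided sum function.
(define (sum nums) (apply + nums))

;; count-toy-files : folder -> number
;; Counts the files in f and in all of its descendants.
(define (count-toy-files f)
  (+ (length (folder-files f))
     ;; With no subfolders, map returns the empty list, sum returns 0,
     ;; and the recursion stops on its own.
     (sum (map count-toy-files (folder-subfolders f)))))

(define toy
  (folder "test" (list "foo.bmp" "test.txt")
          (list (folder "test_2" (list "bar.txt" "foo.bmp") (list))
                (folder "test_3" (list) (list)))))

(count-toy-files toy) ; 4
```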

Activity 2.2. `concat`

Write a function, concat, that takes a list of lists of paths and concatenates them in the given order to produce a list of all given paths.

; concat: (listof (listof path)) -> (listof path)

So:

(concat (list (list (string->path "test"))
              (list (build-path "test" "test_2")
                    (build-path "test" "test_3"))))

would produce the list:

(list (build-path "test")
      (build-path "test" "test_2")
      (build-path "test" "test_3"))

Activity 2.3. `all-directories`

Write a function called all-directories that takes a path to a directory and returns a list of paths of all directories and subdirectories (and their subdirectories, recursively). The result should contain the input directory itself. While the actual order of paths may be different, the elements in the list should be the same.

; all-directories : path -> (listof path)
; Returns a list of paths of all directories within the given directory,
; including the given directory itself.
> (all-directories (string->path "test"))
(list #<path:test> #<path:test/test_2> #<path:test/test_3>)

Note that the path of the input directory, #<path:test>, is also included in the output.

Hints!

1. The path of the input directory is always in the output list. Therefore start your solution by creating a list containing the input path.
2. Augment the function to call itself recursively for each subdirectory. You can do this by using map to recursively call all-directories on each path in the list of subdirectories.
3. Finally, use concat to merge all the lists of pathnames from the recursive calls together and append the result to the one-element list in step 1. Note that merely calling append on the result of recursion cannot work.

append only merges lists one level deep, so passing it lists of lists will return a list of lists:

> (append (list (list 1 2) (list 3 4))
          (list (list 4 5) (list 6 7)))
(list (list 1 2) (list 3 4) (list 4 5) (list 6 7))

And calling it with just one list, i.e. the result of the recursive calls, returns it unchanged:

> (append (list (list 1 2) (list 3 4)))
(list (list 1 2) (list 3 4))

Therefore we call concat with a list of lists to merge the results:

> (concat (list (list (string->path "test"))
                (list (build-path "test" "test_2")
                      (build-path "test" "test_3"))))
(list #<path:test> #<path:test/test_2> #<path:test/test_3>)

Activity 2.4. `search-file-name`

Write a function called search-file-name that takes a search string and a directory path, and returns a list of paths of files in the directory whose names contain the given string. Important: it should only return paths of files whose filename contains the string, not whose paths happen to contain the string elsewhere.

As with previous functions, you need to search the given directory and all its subdirectories.

; search-file-name : string, path -> (listof path)
; Returns a list of paths to files within the original directory, whose
; filenames contain the given string.
> (search-file-name "foo" (string->path "test"))
(list #<path:test/foo.bmp> #<path:test/test_2/foo.bmp>)
; Only search filenames, NOT folder names!
> (search-file-name "test" (string->path "test"))
(list #<path:test/test.txt>) ; test/test_2 and test/test_3 should not show up

Useful functions from the Library:

; path-filename: Path -> Path
; takes a path to a file, and returns a path containing only the filename.
> (define file (string->path "test/test_2/foo.bmp"))
> file
#<path:test/test_2/foo.bmp>
> (path-filename file)
#<path:foo.bmp>

; path->string: Path -> String
; takes a path, and returns a string version.
> (path->string (string->path "some/path/to/file"))
"some/path/to/file"

; string-contains?: String, String -> Boolean
; Takes a query string and a string to search, and returns #true
; if the second string contains the query:
> (string-contains? "ack" "Racket")
#true
> (string-contains? "Racketttt" "Racket")
#false

Hints!

As before, you can follow this recipe to break down your function:

1. Write search-file-name to only search the files immediately contained by the original directory. Ignore subdirectories for now.

> (search-file-name "foo" (string->path "test"))
(list #<path:test/foo.bmp>) ; only one file

2. Now use map to recursively call search-file-name on all of the subdirectories. Note: since search-file-name itself returns a list of paths, calling map will return a list of lists of paths (woah).

Note: map can’t call search-file-name directly, since search-file-name takes two inputs and map only calls the function you give it with one argument (a list element). So you should use lambda to create a new one-argument procedure that calls search-file-name. (Sound familiar? Think back to the homework with artist-is-multigenre?)
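The lambda trick can be sketched on plain strings, without any file operations. The example is plain Racket (`#lang racket`), and `tag-all` and `tagged` are hypothetical names made up for illustration. Because `tag-all` returns a list, mapping it over a list of lists also shows where the "list of lists" result comes from.

```racket
#lang racket
;; tag-all : string (listof string) -> (listof string)
;; Hypothetical two-argument helper: puts tag in front of every name.
(define (tag-all tag names)
  (map (lambda (name) (string-append tag name)) names))

;; map passes its function ONE element at a time, so the lambda
;; supplies the first argument ("foo-") itself.
(define tagged
  (map (lambda (names) (tag-all "foo-" names))
       (list (list "a" "b") (list) (list "c"))))

tagged ; (list (list "foo-a" "foo-b") (list) (list "foo-c"))
```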

3. Finally, use concat to merge all the lists of pathnames!

Appendix. Writing Tests for Files

Writing Tests Involving Paths

ASL prints the internal representation of paths as #<path:...>. However, the printed output is not valid expression syntax. If you embed it directly in your code, you will get syntax errors like `read-syntax: bad syntax '#<'`. Use `string->path` and `build-path` to express the expected output, e.g. `(string->path "test")`.
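As a small sketch of expressing expected paths in code, the example below builds the expected values with `build-path` and `string->path` and compares them with `equal?`. It is plain Racket (`#lang racket`), so it uses `equal?` where an ASL file would use check-expect; that check-expect compares paths the same way is an assumption here. The names `expected` and `actual` are hypothetical.

```racket
#lang racket
;; Expected output written as expressions, never as printed #<path:...> text.
(define expected
  (list (build-path "test")
        (build-path "test" "test_2")))

;; A stand-in for the result of the function under test.
(define actual
  (list (string->path "test")
        (build-path "test" "test_2")))

;; Paths built from the same pieces compare as equal.
(equal? actual expected)
```

The comparison should evaluate to true. Using `build-path` for multi-part paths avoids hard-coding a separator, which differs between Windows and macOS/Linux.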
