Submitting
Regardless of how you choose to complete the assignment, you MUST submit the file you worked on to Canvas.
If you’re submitting remotely, your submitted file will be graded for completeness and correctness via an autograder. Make sure your functions are named correctly and that all of the built-in check-expects pass.
If you’re submitting in-class, make sure to check in (there will be one question called CHECK-IN Q only available for the first 7 minutes of class) and check out (there will be a three-question survey called CHECKOUT SURVEY only available for the last 10 minutes of class) using PollEverywhere with your location verified.
Getting Started
In this tutorial, you’ll learn to write programs that explore the files and directories (folders) of your computer’s file system. Specifically, you’ll be writing:
- functions to explore or search folders for files of a certain name, etc., and
- functions to back up folders by copying folders and files from one location to another
Note: This exercise needs the file_operations.rkt library, which is already required at the beginning of the starter code.
So our goals are:
- Understand files, folders, and path names
- Traverse the file system of our computers
- Understand why manipulating files and folders is a type of imperative programming
- Practice writing functions that manipulate files
Intro: File System Terminology
There are three concepts from file systems we are using here: files, folders, and paths. File systems are structured as trees, so this should feel familiar from past assignments. If you feel comfortable with file system data structures, feel free to skip this section.
- files have names and contain data, like the .rkt file you’re currently working in. (These are like leaf nodes on a tree.)
- folders, also known as directories, have names and contain two types of data:
  - Files
  - Other folders
  Since they can contain other folders, they’re a recursive structure, and form a tree.
- paths represent where to find files in a file system. They consist of a sequence of folder names, and may or may not end in a file name. For example:
  - /test/test_2/bar.txt says there is a folder test, which contains a folder called test_2, which contains a file called bar.txt, and we are referring to this file.
  - /test/test_3/ says there is a folder test, which contains a folder called test_3, and we are referring to this folder.
Paths in ASL
In the pre-recorded lecture (and most of the time in real life), we see paths written as strings, with folder names separated by a /. In ASL, paths are their own data type. They are NOT strings for good reasons as you’ll see below.
To create a new path, use build-path:
; build-path : string ... -> Path
(build-path "test" "test_2" "bar.txt") ; test/test_2/bar.txt (a relative path)
(build-path "test" "test_3") ; test/test_3
You can also convert between paths and strings using the functions string->path and path->string:
(define p (build-path "test" "test_3"))
(path->string p) ; "test/test_3"
(string->path "test/test_2/bar.txt") ; #<path:test/test_2/bar.txt>
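One practical reason paths are not strings: a path knows its own components, so code never has to split on a separator by hand. The following is a minimal sketch written in full Racket (`#lang racket`) rather than ASL, so that it runs without the course library. The helper name `path-parts` is hypothetical; `explode-path` is a standard Racket function, not something from file_operations.rkt.

```racket
#lang racket
;; Hypothetical helper (not part of file_operations.rkt):
;; break a relative path into its component names, as strings.
(define (path-parts p)
  (map path->string (explode-path p)))

(define bar (build-path "test" "test_2" "bar.txt"))

(path-parts bar)             ; three strings: "test", "test_2", "bar.txt"
(path? bar)                  ; true: bar is a path...
(string? bar)                ; false: ...and not a string
(string? (path->string bar)) ; true once it has been converted
```

Because the components are recovered from the path itself, the same code works regardless of which separator the operating system uses.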
Note: On macOS and Unix, paths look like /folder1/folder2/folder3/folder4, while Windows paths look like c:\folder1\folder2\folder3. However, in programming, the Windows-style backslash \ is usually replaced with the Unix-style forward slash /. So, c:/folder1/folder2/folder3 is a valid Windows path in Racket.
Paths can be either absolute paths or relative paths. Absolute paths start with / on macOS/Linux and C:\ on Windows, both of which refer to the top-level directory of your hard drive. Relative paths do not start at the top of the file system. Instead, they start at what we call the “current” directory. So, a path test-folder/file.txt (note no / or c:\ at the beginning) says “file.txt, which is inside of test-folder, which is inside of the directory where I currently am”. While in DrRacket, the current directory is typically the directory where the currently open file is saved.
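To see the difference between relative and absolute paths in code, here is a small sketch in full Racket (`#lang racket`, an assumption made so the example does not depend on the course library). The functions `relative-path?`, `absolute-path?`, `complete-path?`, and `path->complete-path` are standard Racket; the names `rel` and `full` are just for illustration.

```racket
#lang racket
;; A relative path: it has no root at the front.
(define rel (build-path "test-folder" "file.txt"))

(relative-path? rel) ; true
(absolute-path? rel) ; false

;; path->complete-path resolves a relative path against the
;; current directory, producing a path that starts at the root.
(define full (path->complete-path rel))

(complete-path? full) ; true
```

The value of `full` depends on where the program is run, which is exactly what "relative to the current directory" means.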
!CAREFUL! - Imperatives Mean Business
We’ve modified the file operations that change the disk, like
Because these are structs, we can have specialized accessors that allow us convenient access to parts of a Path. For instance, the path-filename function!
; path-filename : Path -> Path
; Returns just the filename part of a path.
Part 1. Backing up files
Before writing anything for yourself, your first goal should be to understand the backup! function included in the template file.
You can run the function like so:
(backup! (string->path "test") (string->path "output"))
This will take all the files in the test subdirectory of the tutorial template folder and copy them into another subdirectory called output.
Screenshot of Initial Directory Structure
Note, this assumes you have opened your RKT file and run it at least once (that's where that compiled folder comes from and where the ~ file comes from!).
You should see the following in the interactions window while running it:
Copying file test\foo.bmp to output\foo.bmp
Copying file test\test.txt to output\test.txt
Copying file test\test_2\bar.txt to output\test_2\bar.txt
Copying file test\test_2\foo.bmp to output\test_2\foo.bmp
File Operations Functions Cheatsheet
; copy-file! : Path Path [boolean] -> void
; Effect: copies the data specified at the 1st Path to the
; location in the 2nd path. If the 3rd arg is included
; and set to #true, any existing file at the 2nd Path
; will first be deleted before the copy is completed.
; file-exists? : Path -> boolean
; Returns #true if a file exists at the given Path
; delete-file! : Path -> void
; Effect: deletes the specified file
; file-or-directory-modify-seconds : Path -> number
; Returns the time the file was last modified. For historical
; reasons this is the number of seconds since January 1, 1970.
; rename-file-or-directory! : Path Path -> void
; Effect: changes the name or the location (directory) of the
; given file (1st Path) to the 2nd Path.
Directory Operations Functions Cheatsheet
; make-directory! : Path -> void
; Effect: creates a new directory with the specified Path
; as long as the parent directory already exists!
; delete-directory! : Path [boolean] -> void
; Effect: deletes a directory at the given Path.
; If the second argument is included and #true, then
; it will delete all files and subdirectories in there.
; directory-exists? : Path -> boolean
; Returns #true if there is a directory by that name
; directory-files : Path -> (listOf Path)
; takes a directory path, and returns a list of
; paths to the files contained in that directory.
> (directory-files (build-path "test"))
(list #<path:test/test.txt> #<path:test/foo.bmp>)
; directory-subdirectories : Path -> (listOf Path)
; Takes a directory path, and returns a list of paths
; to the sub-directories contained in that directory.
> (directory-subdirectories (build-path "test"))
(list #<path:test/test_2> #<path:test/test_3>)
> (directory-files (build-path "test" "test_2"))
(list #<path:test/test_2/bar.txt> #<path:test/test_2/foo.bmp>)
Activity 1.1. Skipping existing files
Right now, backup! will work just fine. However, if you run it a 2nd time, it will re-copy all of the files from the 1st directory to the 2nd directory regardless of whether or not they exist already in the destination!
Your task is to modify backup! so that it only copies a file if it does not already exist in the destination directory. The file-exists? function (see the files cheatsheet above) might be useful here!
Hints!
- First, identify the part of the existing program that is currently in charge of copying files.
- Because it's a sequence of imperatives (first print the file's name, then copy it), it's wrapped in a begin.
- You want to conditionally execute this imperative statement. Wouldn't it be convenient if we had a conditional specifically built for imperatives that allowed us to say "do this unless this is true"?
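As an illustration of a conditional built for imperatives, here is a sketch that uses `unless` on a different task: creating a directory only when it is missing. It is written in full Racket (`#lang racket`) with the standard `make-directory`, which is assumed to behave like the course library's make-directory!. The helper name `ensure-directory!` is hypothetical.

```racket
#lang racket
;; Hypothetical helper: create a directory only if it is missing.
;; The begin groups the two imperatives into one body expression.
(define (ensure-directory! dir)
  (unless (directory-exists? dir)
    (begin
      (printf "Creating directory ~a\n" (path->string dir))
      (make-directory dir))))

;; Work inside a scratch directory so nothing real is touched.
(define scratch (make-temporary-file "unless-demo-~a" 'directory))
(define out (build-path scratch "output"))

(ensure-directory! out) ; prints a message and creates the directory
(ensure-directory! out) ; does nothing: the directory already exists

(delete-directory/files scratch) ; clean up the scratch directory
```

Without the `unless`, the second call would fail, because `make-directory` raises an error when the directory already exists.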
Activity 1.2. Testing your new backup!
Testing with imperatives has been a mess in previous weeks, but now it's even worse. Automated tests with files are quite complex because other programs on our computers might also be affecting the state of our directories at the same time. Instead of writing check-expects, you can test your fixed function by:
- First running (backup! (string->path "test") (string->path "output"))
- To see if it doesn’t copy files unnecessarily, create a new file (either through ASL or through another application on your computer), such as new-file.txt, somewhere within the test directory. Rerun the command above, and verify that new-file.txt gets copied into output, but none of the other files are re-copied.
- To verify it still copies files when it needs to, delete one of the files from output, and re-run the command. The deleted file should be re-copied.
Activity 1.3. Updating stale backups
Copy your fixed definition of backup! and make an identical function named backup-new!. Now modify this new function to also copy files that exist in the output directory, but have been modified in the original since they were last backed up.
In the previous question, we modified backup! to avoid making copies needlessly, but we made it too aggressive… now, if we make a backup and then change the original file, backup! wouldn’t copy the revised version into the output folder. We want to fix this in backup-new!.
Essentially you need to modify backup-new! to copy files when either:
- The file does not exist in the backup directory (you did this in 1.1).
- The file already exists in the backup directory, but the original file has been modified since the backup was created (i.e. the date modified of the original is larger than the date modified of the backup)
The file-or-directory-modify-seconds function will be very helpful here (check out the File Operations Functions cheatsheet above)!
Hints
- On Windows, copying a file gives it the same “Date Modified” as the original, so make sure you do not copy the file if the modification times are the same.
- If you are getting the following error:
  file-or-directory-modify-seconds: error getting file/directory time ...system error: No such file or directory; errno=2
  Remember that order matters in imperative code, and it’s important to check whether the file exists BEFORE you check when it was last modified. (You can’t ask the operating system for the date modified of a nonexistent file!) So you should structure your condition like this:
  (and (file-exists? ...) ...check last modification time...)
  You won’t have to worry about the error, because and will evaluate conditions in order and bail early as soon as one condition returns false.
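The short-circuiting behavior of and can be seen in isolation with a small sketch. It is written in full Racket (`#lang racket`) using the standard `file-exists?` and `file-or-directory-modify-seconds`. The helper `modified-since?` is hypothetical and answers a simpler question than the activity asks: has this one file been modified after a given time?

```racket
#lang racket
;; Hypothetical helper: #true when the file at p exists AND was
;; modified after the given time (in seconds since 1970).
;; The existence check comes first, so the modification time is
;; never requested for a file that is not there.
(define (modified-since? p seconds)
  (and (file-exists? p)
       (> (file-or-directory-modify-seconds p) seconds)))

(define missing (build-path "no-such-folder" "no-such-file.txt"))
(modified-since? missing 0) ; false, and no error is raised

(define tmp (make-temporary-file)) ; a fresh file that does exist
(modified-since? tmp 0)            ; true: it was modified after 1970
(delete-file tmp)
```

Swapping the two conditions inside the and would bring the error back for the missing file, since the modification time would be requested before the existence check.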
Activity 1.4. Testing backup-new!
To test this function:
- Check for regressions by repeating all the tests from the backup! question, to make sure no previously working code has been broken.
- Verify the new function re-copies files that have been modified: modify one of the files in the original test directory, for example by editing test.txt, then run backup-new! and check that the file was copied over.
Part 2. Searching for Files
Activity 2.1. count-files
Write a function, count-files, that takes the pathname of a directory as input and returns the number of files within the directory and all its subdirectories (and their subdirectories, recursively).
; count-files : path -> number
; Takes a path to a directory, and returns the number
; of files within the directory and all its descendants,
; recursively.
So (count-files (string->path "test")) would return 4, assuming you hadn’t modified the test directory.
Off by one error?
Due to a hidden macOS system file called .DS_Store, you may see a result here that is one more than the number of visible files. Windows will sometimes also create a hidden file called Thumbs.db. So don’t panic if you see a number of files that looks one or two files too large. It’s probably giving the right answer; it’s just including files you can’t see. You can use
Hint 1
Write a simple function called
Hint 2
Modify it to call itself recursively for each subdirectory. You can do this by using
Hint 3
Finally, use the provided
Note: this is a little weird since it’s a recursion that doesn’t require you to write the base case and use an if to keep the function from recursing infinitely. If you call directory-subdirectories on a directory with no subdirectories, it will return the empty list and so map won’t attempt to recurse any further. Since map has an if that checks for the empty list, that’s enough to stop the recursion.
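The idea of a recursion whose base case is hidden inside map can be tried out without touching the disk. The sketch below is written in full Racket (`#lang racket`) and uses a hypothetical in-memory `folder` struct as a stand-in for a real directory. The use of `apply +` to add up the results is this example's own choice, not necessarily the helper the hints refer to.

```racket
#lang racket
;; Hypothetical in-memory stand-in for a directory:
;; a list of file names plus a list of sub-folders.
(struct folder (files subfolders))

;; count-names : folder -> number
;; No explicit base case: when subfolders is empty, map returns
;; the empty list and the recursion stops on its own.
(define (count-names f)
  (apply +
         (length (folder-files f))
         (map count-names (folder-subfolders f))))

;; Mirrors the shape of the test directory described in this tutorial.
(define tree
  (folder (list "test.txt" "foo.bmp")
          (list (folder (list "bar.txt" "foo.bmp") '())
                (folder '() '()))))

(count-names tree) ; 4
```

The file-system version has the same shape, with the struct accessors replaced by calls that list a directory's files and subdirectories.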
Activity 2.2. concat
Write a function, concat, that takes a list of lists of paths and concatenates them in the given order to produce a list of all given paths.
; concat: (listof (listof path)) -> (listof path)
So:
(concat (list (list (string->path "test"))
(list (build-path "test" "test_2")
(build-path "test" "test_3"))))
would produce the list:
(list (build-path "test")
(build-path "test" "test_2")
(build-path "test" "test_3"))
Activity 2.3. all-directories
Write a function called all-directories that takes a path to a directory and returns a list of paths of all directories and subdirectories (and their subdirectories, recursively). The result should contain the input directory itself. While the actual order of paths may be different, the elements in the list should be the same.
; all-directories : path -> (listof path)
; Returns a list of paths of all directories within the given directory,
; including the given directory itself.
> (all-directories (string->path "test"))
(list #<path:test> #<path:test/test_2> #<path:test/test_3>)
Note that the path of the input directory, #<path:test>, is also included in the output.
Hints!
- The path of the input directory is always in the output list. Therefore start your solution by creating a list containing the input path.
- Augment the function to call itself recursively for each subdirectory. You can do this by using map to recursively call all-directories on each path in the list of subdirectories.
- Finally, use concat to merge all the lists of pathnames from the recursive calls together and append the result to the one-element list in step 1. Note that merely calling append on the result of recursion cannot work.
- append only merges lists one level deep, so passing it lists of lists will return a list of lists:
  > (append (list (list 1 2) (list 3 4)) (list (list 4 5) (list 6 7)))
  (list (list 1 2) (list 3 4) (list 4 5) (list 6 7))
  And calling it with just one list, i.e. the result of the recursive calls, returns it unchanged:
  > (append (list (list 1 2) (list 3 4)))
  (list (list 1 2) (list 3 4))
  Therefore we call concat with a list of lists to merge the results:
  > (concat (list (list (string->path "test")) (list (build-path "test" "test_2") (build-path "test" "test_3"))))
  (list #<path:test> #<path:test/test_2> #<path:test/test_3>)
Activity 2.4. search-file-name
Write a function called search-file-name that takes a search string and a directory path, and returns a list of paths of files in the directory whose names contain the given string. Important: it should only return paths of files whose filename contains the string, not whose paths happen to contain the string elsewhere.
As with previous functions, you need to search the given directory and all its subdirectories.
; search-file-name : string, path -> (listof path)
; Returns a list of paths to files within the original directory, whose
; filenames contain the given string.
> (search-file-name "foo" (string->path "test"))
(list #<path:test/foo.bmp> #<path:test/test_2/foo.bmp>)
; Only search filenames, NOT folder names!
> (search-file-name "test" (string->path "test"))
(list #<path:test/test.txt>) ; test/test_2 and test/test_3 should not show up
Useful functions from the Library:
; path-filename: Path -> Path
; takes a path to a file, and returns a path containing only the filename.
> (define file (string->path "test/test_2/foo.bmp"))
> file
#<path:test/test_2/foo.bmp>
> (path-filename file)
#<path:foo.bmp>
; path->string: Path -> String
; takes a path, and returns a string version.
> (path->string (string->path "some/path/to/file"))
"some/path/to/file"
; string-contains?: String, String -> Boolean
; Takes a query string and a string to search, and returns #true
; if the second string contains the query:
> (string-contains? "ack" "Racket")
#true
> (string-contains? "Racketttt" "Racket")
#false
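To make the "filename only" requirement concrete, here is a sketch of a single check on one path, written in full Racket (`#lang racket`) so it runs without the course library. The helper `filename-contains?` is hypothetical, and the standard `file-name-from-path` stands in for the library's path-filename. One caution: Racket's own `string-contains?` takes the string to search first and the query second, which is the reverse of the order described above for the course library.

```racket
#lang racket
;; Hypothetical helper: does the *filename* part of p contain query?
;; file-name-from-path drops the folder names before the search, so
;; a match in a folder name does not count.
(define (filename-contains? query p)
  ;; Standard Racket order: string to search first, query second.
  (string-contains? (path->string (file-name-from-path p)) query))

(filename-contains? "foo"  (build-path "test" "test_2" "foo.bmp")) ; true
(filename-contains? "test" (build-path "test" "test_2" "foo.bmp")) ; false: only folder names match
(filename-contains? "test" (build-path "test" "test.txt"))         ; true
```

Searching the full path string instead would wrongly report test/test_2/foo.bmp as a match for "test".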
Hints!
As before, you can follow this recipe to break down your function:
1. Write
> (search-file-name "foo" (string->path "test"))
(list #<path:test/foo.bmp>) ; only one file
2. Now use map to recursively call search-file-name on all of the subdirectories. Note: since search-file-name itself returns a list of paths, calling map will return a list of lists of paths (woah):
Note:
Finally, use