Computer Science & Software Engineering
C Programming (CITS1210) - Labsheet 4

Labsheet 4: Reading from Text Files

For the week commencing Monday 25th August.

Most of the following tasks will require you to write functions that read the contents of a text file. Your programs will typically require the C99 functions fopen, fgets, and fclose. Refer to Lecture 4 for an example of how to do this.
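The open, read, close pattern mentioned above can be sketched as follows. This is only a minimal illustration, not the code from Lecture 4: the helper name copy_lines and its return convention are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical helper (not from the lecture): copies the named text file
   to the stream `out`, one fgets() buffer at a time.
   Returns the number of successful fgets() calls, or -1 if the file
   cannot be opened. A line longer than the buffer is read in several
   pieces, so the result can exceed the true line count for such files. */
int copy_lines(const char filename[], FILE *out)
{
    FILE *fp = fopen(filename, "r");
    if (fp == NULL) {
        return -1;                      /* could not open the file */
    }

    char line[BUFSIZ];
    int count = 0;

    /* fgets() returns NULL at end-of-file (or on a read error) */
    while (fgets(line, sizeof line, fp) != NULL) {
        fputs(line, out);               /* the newline, if any, is kept in line */
        ++count;
    }

    fclose(fp);
    return count;
}
```

A main function could call copy_lines(argv[1], stdout) after checking that argc is 2.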

Use the sample text file SampleTextFile for testing your programs.


This week's tasks:

Remember that we're using a chili icon to indicate if a question is a bit more difficult.
First-time programmers should aim to complete all questions without chilies.

1. Write a function named display that displays the contents of a text file, line-by-line. Your function should take one argument, a character array that represents the name of the file to display, and should return void. Test your function from within a program that accepts a filename as a single command-line argument, passing this string to your display function.
2. The Mac-OSX system, and all Unix/Linux systems, provide a standard command named wc (an abbreviation for wordcount!) which determines the number of lines, words, and characters in a named file. You may read about this command using the online documentation:

    csse2100%  man wc

For this task, and the four following, you will develop your own version of the wc program named mywc.

Firstly, write a function named counter that calculates the number of lines contained within a file. Recall from Lecture 4 that a line in a file may be terminated by either a newline ('\n') or a carriage-return ('\r') character.

As in Question 1, your function should take one argument, a character array that provides a filename. Make your function return void.

Your counter function should print the number of lines in the file before returning to the calling function. Test your function from within a program that accepts a filename as a single command-line argument.

3. Extend your mywc program, and the counter function, so that it now also calculates and prints the number of characters found in the file. As with the standard Unix wc program, all characters (including newline, carriage-return, and whitespace characters) should be counted.
4. Further extend your mywc program, and the counter function, so that it now also calculates and prints the length of the longest line it finds in the file. Note, however, that this count should not include the newline or carriage-return character, should they be found.
5. Further extend your mywc program, and the counter function, so that it now also calculates and prints the number of words found in the file. In this context, a word is defined as a continuous character sequence separated by whitespace characters.

Refer to the online documentation (man pages) to learn how the isspace function is useful for this question. Make sure to compare your word count against that reported by the standard wc command.
Hint: this task is easier if your counter function remembers whether it is inside or outside of a word at any time.
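The inside/outside-of-a-word idea from the hint can be sketched on a plain string. This is a minimal illustration under assumptions: the function name count_words is hypothetical, and it works on one string rather than on a file. When a file is read in pieces with fgets, the inside flag would need to be kept between calls so that a long line split across buffers is not miscounted.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helper: counts whitespace-separated words in a string
   by remembering whether the scan is currently inside a word. */
int count_words(const char text[])
{
    bool inside = false;    /* are we currently inside a word? */
    int words = 0;

    for (int i = 0; text[i] != '\0'; ++i) {
        /* the cast avoids undefined behaviour for negative char values */
        if (isspace((unsigned char)text[i])) {
            inside = false;             /* whitespace ends any current word */
        } else if (!inside) {
            inside = true;              /* first character of a new word */
            ++words;
        }
    }
    return words;
}
```

A word is counted once, at its first character, so runs of several spaces or tabs between words do not inflate the total.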

6. Modify your mywc program such that it now accepts command-line arguments that indicate what information should be printed. Assume the last command-line argument is the name of the file to display, but also allow switches (which take the form -X, where X is a character) that control what is displayed. For example, use the switch -c to request the character count, the switch -l to request the line count, the switch -w to request the word count, and the switch -m to request the maximum line length (note that our use of -m is different to that used in the standard wc command). If invoked with no command-line switches, your program should display all counts.

    csse2100%  ./mywc SampleTextFile
    The file 'SampleTextFile' contains 434 character(s).
    The file 'SampleTextFile' contains 72 word(s).
    The file 'SampleTextFile' contains 12 line(s).
    The longest line in the file 'SampleTextFile' contains 68 character(s).

    csse2100%  ./mywc -m -w SampleTextFile
    The file 'SampleTextFile' contains 72 word(s).
    The longest line in the file 'SampleTextFile' contains 68 character(s).

Note: the order of the switches on the command-line is irrelevant, but they must appear before the filename.

You will need to modify your main function so that it now has new Boolean variables (four of them, named cflag, lflag, mflag, and wflag, if you have completed all the parts above) to indicate whether the corresponding switch was seen on the command-line. You will also need to modify your counter function so that it now additionally accepts arguments that control which values to display. Do this by adding Boolean arguments that are either true or false depending on whether to display that particular value or not. For example, your function should look something like:

    void counter(char name[], bool displayLineCount, bool displayWordCount,
                 bool displayCharCount, bool displayMaxLine)
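The switch handling for Question 6 could be sketched as below. This is one possible arrangement, not a required design: the struct options and the function parse_switches are hypothetical names, and grouping the four flags in a struct (rather than keeping four separate variables in main) is an assumption made to keep the example short.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the four flags and the filename. */
struct options {
    bool cflag, lflag, mflag, wflag;
    const char *filename;
};

/* Hypothetical helper: fills in *opts from the command line.
   Every argument before the last must be a switch of the form -X.
   Returns false if there is no filename or a switch is not recognised. */
bool parse_switches(int argc, char *argv[], struct options *opts)
{
    opts->cflag = opts->lflag = opts->mflag = opts->wflag = false;
    opts->filename = NULL;

    if (argc < 2) {
        return false;                   /* no filename given */
    }

    for (int i = 1; i < argc - 1; ++i) {
        const char *arg = argv[i];
        /* accept exactly two characters: '-' followed by one letter */
        if (arg[0] != '-' || arg[1] == '\0' || arg[2] != '\0') {
            return false;
        }
        switch (arg[1]) {
        case 'c': opts->cflag = true; break;
        case 'l': opts->lflag = true; break;
        case 'm': opts->mflag = true; break;
        case 'w': opts->wflag = true; break;
        default:  return false;         /* unknown switch */
        }
    }

    opts->filename = argv[argc - 1];    /* the last argument is the file */

    /* no switches at all means "display every count" */
    if (!opts->cflag && !opts->lflag && !opts->mflag && !opts->wflag) {
        opts->cflag = opts->lflag = opts->mflag = opts->wflag = true;
    }
    return true;
}
```

Because each switch only sets a flag, the order of the switches on the command-line does not matter, which matches the note above.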
7. Write a function named areFilesEqual that determines if the contents of two text files are identical (i.e. if every character in the first file appears in the second file in precisely the same order). Your function should take two arguments: two character arrays that represent the names of the files to compare, and return a Boolean indicating whether the files are identical or not. The calling function (the main function) should report the result returned by your areFilesEqual function. Accept the names of the files as two command-line arguments.

Do this two ways:

  1. Using a loop to iterate over all the characters, performing the comparison character by character.
  2. Using the C99 standard strcmp function to compare two lines (one from each file) directly.
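The first approach, comparing character by character, might look like the sketch below. Note the assumptions: it reads with the standard fgetc function rather than fgets, and it takes two already-open streams instead of filenames. The name streams_equal is hypothetical; an areFilesEqual function would open both files, call something like this, and close them again.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: returns true if both streams deliver exactly the
   same characters, in the same order, from their current positions. */
bool streams_equal(FILE *a, FILE *b)
{
    int ca, cb;     /* int, not char, so that EOF can be represented */

    do {
        ca = fgetc(a);
        cb = fgetc(b);
        if (ca != cb) {
            return false;   /* different character, or one stream ended early */
        }
    } while (ca != EOF);

    return true;            /* both streams reached end-of-file together */
}
```

If one file is a prefix of the other, one fgetc call returns EOF while the other returns a character, so the comparison correctly reports a difference.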
8. Using Lecture 4 as a guide, write a program (and function) named pattern to search through the Unix dictionary (/usr/share/dict/words) for a given word. Your function should take one argument, a character array that represents the word to locate in the dictionary, and print a message indicating whether you found an exact match or not. Accept, as a single command-line argument to your program, the pattern required. For example:

    csse2100%  ./pattern food
    pattern 'food' found in /usr/share/dict/words

    csse2100%  ./pattern foid
    pattern 'foid' NOT found in /usr/share/dict/words
9. Extend your program from Question 8 such that the provided pattern may contain a period character ('.'), meaning "match any character". In this context, a period means a character must be present at that location in the word, but that character can be any character. For example:

    csse2100%  ./pattern fo.d
    pattern 'fo.d' found in /usr/share/dict/words: fold
    pattern 'fo.d' found in /usr/share/dict/words: food
    pattern 'fo.d' found in /usr/share/dict/words: ford
    pattern 'fo.d' found in /usr/share/dict/words: foud

    csse2100%  ./pattern f...f
    pattern 'f...f' found in /usr/share/dict/words: feoff
    pattern 'f...f' found in /usr/share/dict/words: flaff
    pattern 'f...f' found in /usr/share/dict/words: fluff

    csse2100%  ./pattern j...j
    pattern 'j...j' NOT found in /usr/share/dict/words
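The period rule from Question 9 can be sketched as a comparison of two strings. The function name matches_dots is hypothetical, and the sketch assumes that any trailing newline left by fgets has already been removed from the dictionary word.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true if word matches pattern, where a
   '.' in the pattern stands for exactly one character of any kind. */
bool matches_dots(const char pattern[], const char word[])
{
    int i = 0;

    while (pattern[i] != '\0' && word[i] != '\0') {
        if (pattern[i] != '.' && pattern[i] != word[i]) {
            return false;       /* a literal character differs */
        }
        ++i;
    }

    /* a match requires both strings to end at the same position */
    return pattern[i] == '\0' && word[i] == '\0';
}
```

The final test is what makes a period mean "a character must be present": the pattern fo.d does not match foo, because the word ends before the pattern does.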
10. Extend your program from Question 9 such that the provided pattern may contain an asterisk character ('*'), meaning "match zero or more characters at this position". In this context, an asterisk can be replaced by any number of (including zero) other characters, all of which may be any other character. For example:

    csse2100% ./pattern 'g*o'
    All words in the dictionary starting with 'g' and ending with 'o' are
    printed (including the word "go")

    csse2100% ./pattern '*a'
    All words in the dictionary ending with 'a' are printed (including the word "a")

    csse2100% ./pattern '*'
    All words in the dictionary are printed.

    Very hard!
    csse2100% ./pattern '*al*cat*'
    All words in the dictionary containing the sub-patterns "al" "cat", each
    separated by zero or more characters.  There are 64 of them in the
    standard Macintosh dictionary.

NOTE: in the examples above, the command-line arguments containing asterisks need to be enclosed in single quotes to prevent the shell from attempting to expand them as filename patterns (e.g. proj*t.c).
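One way to handle the asterisk is recursion: at each '*', try letting it absorb zero characters, then one, then two, and so on. The sketch below is only one possible approach, and the function name matches is hypothetical. It also assumes the dictionary word has had its trailing newline removed. Patterns with many asterisks can make it slow in the worst case, although that should not matter for short dictionary words.

```c
#include <stdbool.h>

/* Hypothetical helper: returns true if word matches pattern, where
   '.' matches exactly one character and '*' matches zero or more. */
bool matches(const char *pattern, const char *word)
{
    if (*pattern == '\0') {
        return *word == '\0';   /* pattern used up: word must be too */
    }

    if (*pattern == '*') {
        /* let the '*' absorb 0, 1, 2, ... characters of the word */
        for (const char *rest = word; ; ++rest) {
            if (matches(pattern + 1, rest)) {
                return true;
            }
            if (*rest == '\0') {
                return false;   /* nothing left for the '*' to absorb */
            }
        }
    }

    /* '.' or a literal: consume one character from each string */
    if (*word != '\0' && (*pattern == '.' || *pattern == *word)) {
        return matches(pattern + 1, word + 1);
    }
    return false;
}
```

For example, matches("g*o", "go") succeeds with the asterisk absorbing nothing, while matches("g*o", "got") fails because no choice for the asterisk leaves the word ending in 'o'.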

CRICOS Provider Code: 00126G