CSci 157 Homework #8
This assignment is worth 50 points.
Reading/Resources. Zelle Chapter 10; nested loops/iteration in ThinkCSPY Sections 9.17-9.18. Also review the past few labs and the Nested Structures activity.
Note about grading: Partial credit is always available - a running program that works for some of the problem specification will earn at least 50% of the possible points, assuming a good faith effort. A program that stops with an error will still receive some points based on the grading rubric.
Part 1. Working with Data Files (20 points).
- Download the data file ccdata.txt
- Work through the example programs in Chapter 11, Sections 1 through 7: Working with Data Files, but copy the programs and run them on your computer, using the data file you just downloaded. Make sure ccdata.txt and the Python programs are in the same folder.
- Create a Python file for each program in the boxed Activity sections, which start in Section 1.4
- Don't skip the first three sections! You'll need to know this information. Also feel free to try out any other code given, to see what it does.
- For the program in Section 11.7, use ccdata.txt instead of mydata.txt.
- You'll turn in all Python files, screenshots of your output, and the
output file emissions.txt; see below.
- (Optional) The File I/O activity offers more practice with this
topic. Work through this activity to make up a missed in-class
exercise (or extra credit if you've completed them all). Turn in the
activity sheet and post the files generated.
Part 2. Nested Loops and Data Files (30 points).
As always, include function signatures and docstrings. Add tests where appropriate.
- (8 points) Programming Exercise 2, page 279 in the textbook. Hint:
use nested for loops and your function from Homework 6. The outer loop
controls the rows, and the inner (nested) loop prints all the
windchill values for a single row.
To print multiple windchill values on a single row, see the description of the print function in Section 2.4 of the textbook. This shows a way to use print so that it does not start a new line each time.
- (8 points) Programming Exercise 14, page 281 in the textbook. This looks harder than it is. Load and draw an image file in main, and call a separate function to do the conversion as shown in the exercise. You'll use nested loops again.
- (14 points) Download the E. coli genome file into the folder with your programs for
this homework. The file consists of a sequence of DNA nucleobases
given by the letters AGCT. Write a new program that first reads the
data in this file, line by line, into a single string. Note:
the functions for reading from a file keep the newline character that
marks the end of each line in the string read, and we don't want that,
so use the strip method on each line you read. For example, if you read a string into a variable line, call line.strip(). To check that reading the file worked, print the length of the string once you've read it; the length should be 5,065,741. Code to do the file reading and length checking should go in the main function.
Now add the following features to this program:
- A function that takes the DNA string as one parameter and a
letter as the second parameter, and returns the number of times
that letter appears in the DNA strand. Note: this function
may take some time given the size of the DNA string.
- Since only the letters A, C, G, and T are valid, modify the function to check that the second parameter is one of those four letters before scanning the string for the letter. Return zero right away if the letter is not one of the four. The Python keyword in may simplify this task, so look it up if you need to.
- Add timing code (as in Lab 10)
to determine how long this function takes to run. Print the
result in seconds with no more than 3 digits after the decimal
point.
- In main(), call your function twice, to determine how many times the letters 'A' and 'T' appear. Output timing results and the count for each letter.
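The building blocks named in the genome exercise can be sketched on a tiny in-memory sample rather than the real file: stripping the newline from each line, checking membership with in, and timing a call. This is only a sketch under stated assumptions. The names join_lines, is_valid_base, timed_call, and VALID_BASES are hypothetical, and time.perf_counter is an assumed choice; the timing code from Lab 10 may use a different call.

```python
import time

# Hypothetical name; a tuple so that only single letters match.
VALID_BASES = ("A", "C", "G", "T")


def join_lines(lines):
    """Join lines into one string, stripping the newline from each line."""
    dna = ""
    for line in lines:
        dna += line.strip()  # strip() removes the trailing "\n"
    return dna


def is_valid_base(letter):
    """Return True if letter is one of the four valid bases."""
    return letter in VALID_BASES


def timed_call(func, *args):
    """Call func(*args) and return (result, elapsed time in seconds)."""
    start = time.perf_counter()  # assumed timer, not necessarily Lab 10's
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed


# A list of lines stands in for the file; an open file can be looped over the same way.
sample_lines = ["AGCT\n", "TTGA\n", "CC\n"]
dna = join_lines(sample_lines)
length, seconds = timed_call(len, dna)

print("Length:", length)
print(is_valid_base("A"), is_valid_base("X"))
print(f"{seconds:.3f} seconds")  # at most 3 digits after the decimal point
```

The format specifier .3f rounds the elapsed time to three decimal places, which satisfies the output requirement for the timing result.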
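For the table hint in the first Part 2 exercise, here is a minimal sketch of nested loops that print several values on one row. It assumes the technique described in Section 2.4 is the end keyword argument of print, and it uses a small multiplication table rather than windchill values; print_table is a hypothetical name.

```python
def print_table(rows, cols):
    """Print a small multiplication table, one row of values per line."""
    for r in rows:                        # outer loop: one pass per row
        for c in cols:                    # inner loop: every value in this row
            print(f"{r * c:6d}", end="")  # end="" keeps output on the same line
        print()                           # finish the row with a newline


print_table(range(1, 4), range(1, 6))
```

The same shape applies when each cell comes from a function call instead of r * c.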
Turning in Your Work
Written Problems. There aren't any written problems this time.
Programming Problems. For Part 1, post your results: screenshots of output and the emissions.txt file. For both parts, post all Python files (the .py files) to the Assignments page on BrightSpace. Be sure your name is somewhere in the comment (or docstring) at the top of each file.