Don't Panic!
You may use only your book (print or pdf), your notes, homework, anything else you have downloaded to your computer and the IPython command prompt.
You may not access the web, instant messaging, phones, or spirit guides. You will pledge on your honor that you have neither given nor received unauthorized help.
You must submit your exam before you leave the room.
Enter your onyen below.
Onyen = 'jpuccio' # not your PID, your onyen
I certify that no unauthorized assistance has been received or given in the completion of this work.
Enter your full name as the value of the Pledge variable below.
Pledge = 'Joseph Puccio' # type your full name between the quotes
# setup
import comp521
check, report = comp521.start('Final')
#setup for the db questions
import sqlite3
from pprint import pprint
# a helper function for testing your queries, do not modify this
def check_query(label, query, points=0):
conn = sqlite3.connect('final.db')
cursor = conn.cursor()
result = cursor.execute(query).fetchall()
print 'your result'
pprint(result)
check(label, result, points=points)
The first part of the exam will consist of multiple choice questions. You'll answer each question by assigning a single character string to the variable just below the question. For example:
Q0_1 How many letters are in SQL?
A) 1
B) 3
C) 2
D) 42
A0_1 = 'B'
Your answer must be one of: 'A' 'B' 'C' 'D' or 'E'. No other answer is acceptable. Each question has a single best answer.
Q1_1. Of the following, which best characterizes the objective of database table normalization?
A) Evaluating queries more efficiently
B) Efficient testing of functional dependencies
C) Creating more independent relations
D) Minimizing redundancy
A1_1 = 'D'
check('Q1_1', A1_1, points=4, choice='ABCD')
Q1_1 answered
Q1_2. What is the primary technique used to normalize tables?
A) Vertical decomposition
B) Horizontal decomposition
C) Table Merging
D) SQL Assertions
A1_2 = 'A'
check('Q1_2', A1_2, points=4, choice='ABCD')
Q1_2 answered
Q1_3. How does concurrency improve database performance?
A) By making transactions durable
B) By making transactions atomic
C) By isolating transactions
D) By allowing aborts of unnecessary transactions
E) By scheduling CPU and disk resources for maximum utilization
A1_3 = 'E'
check('Q1_3', A1_3, points=4, choice='ABCDE')
Q1_3 answered
Q1_4. Suppose that the Strict 2PL scheme is modified to use only exclusive locks, which are requested before both reads and writes. What is the result of this change?
A) Concurrent transactions always deadlock
B) The system functions correctly, but with lower performance
C) The resulting schedules are not serialized
D) It behaves identically to Strict 2PL
A1_4 = 'B'
check('Q1_4', A1_4, points=4, choice='ABCD')
Q1_4 answered
Q1_5. When a transaction is aborted, which of the following may be performed by the DBMS?
A) Changes to pages are undone
B) Its locks are freed
C) It may be rescheduled
D) Other transactions may be aborted
E) All of the above
A1_5 = 'E'
check('Q1_5', A1_5, points=4, choice='ABCDE')
Q1_5 answered
Q1_6. Consider the following valid instance of a relation:
sid name userid age gpa
50000 Dave dave@cs 19 3.3
53666 Jones jones@cs 18 3.4
53688 Smith smith@ee 18 3.2
53650 Smith smith@math 19 3.8
53831 Madayan mada@music 19 1.8
53832 Guldu guldu@music 22 2.0
Which of the following can you infer is not a candidate key based on this instance?
A) sid
B) name
C) userid
D) <name, age>
A1_6 = 'B'
check('Q1_6', A1_6, points=4, choice='ABCD')
Q1_6 answered
Q1_7. Which of the following database properties states that, once committed, data should persist in the database, even if the system crashes?
A) Atomicity
B) Buffer Writes
C) Consistency
D) Durability
E) Equality
A1_7 = 'D'
check('Q1_7', A1_7, points=4, choice='ABCDE')
Q1_7 answered
Q1_8. Of the following disk-transfer timing characteristics, which is likely to be greatest when reading a random block?
A) Seek time
B) Rotational latency
C) Data-transfer time
D) Parity checking
A1_8 = 'A'
check('Q1_8', A1_8, points=4, choice='ABCD')
Q1_8 answered
Q1_9. Which of the following database enhancements created the possibility of deadlock?
A) Error recovery
B) Locks
C) Aborting transactions
D) Committing transactions
E) All of the above
A1_9 = 'B'
check('Q1_9', A1_9, points=4, choice='ABCDE')
Q1_9 answered
Q1_10. Which of the following are not consequences of database redundancy?
A) Redundant storage
B) Update anomalies
C) Insertion anomalies
D) Deletion anomalies
E) Lossless-Join Decompositions
A1_10 = 'E'
#Not sure on this one
check('Q1_10', A1_10, points=4, choice='ABCDE')
Q1_10 answered
Q1_11. If relation R1 has degree D1 and cardinatily C1 and relation R2 had degree D2 and cardinality C2, what is the degree DJ and cardinatily CJ of their natural join (that is joining on equality between fields with the same name)?
A) max(D1, D2) <= DJ <= D1+D2, 0 <= CJ <= C1*C2
B) min(D1, D2) <= DJ <= D1*D2, min(C1,C2) < CJ < max(C1,C2)
C) D1+D2 < DJ <= D1*D2, CJ = C1+C2
D) D1 < DJ < D2, C1 < CJ < C2
A1_11 = 'A'
check('Q1_11', A1_11, points=4, choice='ABCD')
Q1_11 answered
Q1_12. What can you say about the cardinality Co and degree Do of the output of a relational algebra sigma operator when applied to a relation with cardinality Ci and degree Di?
A) Co > Ci, Do < Di
B) Co <= Ci, Do = Di
C) Co >= Ci, Do > Di
D) Co < Ci, Do = Di
E) Co < Ci, Do < Di
A1_12 = 'B'
check('Q1_12', A1_12, points=4, choice='ABCDE')
Q1_12 answered
For this part of the exam you are given a database final.db
with records resulting from
analysis of one day of the server logs for our Tar Heel Reader web site. The database has several tables as shown below:
create table Reads ( -- a record for each book that was read
-- these next 6 fields are when the book was read
-- broken into parts for easy access
year integer, -- 2015
month integer, -- (1-12)
day integer, -- (1-31)
hour integer, -- (0-23)
minute integer, -- (0-59)
second integer, -- (0-59)
ip integer, -- ip address of the computer
slug text, -- key to Books table
primary key (time, ip),
foreign key (slug) references Books,
foreign key (ip) references Locations)
create table Books ( -- real information about each book
slug text primary key, -- unique key for each book
title text, -- book title
author integer, -- key to Authors table
reviewed integer, -- 1 if the book is reviewed, 0 otherwise
pagecount integer, -- number of pages in the book
foreign key (author) references Authors)
create table Authors ( -- fake information about each author
id integer primary key,
name text) -- author name (these may not be unique)
create table Locations ( -- real information about geolocation
ip integer primary key, -- ip address of the computer
country text) -- country eg 'United States'
For simplicity assume that the ip address identifies a unique user. Of course it doesn't for a variety of reasons but we'll simplify for the exam.
For each question you must fill in the SQL query string that will provide the requested result. Simply write your query between the provided quotes, for example:
A2_0 = '''
select * from Authors
'''
The following cell will print a few lines from each table so you can see the format of the fields.
# just a helper to show you the first few records from each table
db = sqlite3.connect('final.db')
print 'Reads'
pprint(db.execute('''select * from Reads limit 3''').fetchall())
print '\nBooks'
pprint(db.execute('''select * from Books limit 3''').fetchall())
print '\nAuthors'
pprint(db.execute('''select * from Authors limit 3''').fetchall())
print '\nLocations'
pprint(db.execute('''select * from Locations limit 3''').fetchall())
db.close()
Reads [(2015, 11, 9, 0, 2, 38, 487960907, u'veterans-day'), (2015, 11, 9, 0, 9, 50, 2078134816, u'pilgrims-3'), (2015, 11, 9, 0, 10, 40, 2078134816, u'journey-to-a-new-land')] Books [(u'tarantulas-3', u'Tarantulas', 7707, 1, 6), (u'the-lost-gerbil', u'The Lost Gerbil', 7707, 1, 8), (u'lets-go-bowling', u"Let's go bowling!", 1595, 0, 16)] Authors [(1, u'Miss Mahalie Kris DDS'), (2, u'Adalynn Shanahan'), (5, u'Vella DuBuque')] Locations [(13613336, u'United States'), (15503167, u'United States'), (17014744, u'United States')]
Q2_1. How many different books were read from Australia?
# no need to write python, just write a single SQL statement.
A2_1 = '''
SELECT COUNT(DISTINCT R.slug) FROM
Reads R, Locations L
WHERE
R.ip = L.ip
AND
L.country = 'Australia'
'''
check_query('Q2_1', A2_1, points=4)
# Your result will be printed below
# It should be a list with a single tuple containing a single integer
your result [(171,)] Q2_1 correct
Q2_2. How many different authors wrote books that are reviewed?
# no need to write python, just write a single SQL statement.
A2_2 = '''
SELECT COUNT(DISTINCT A.id) FROM
Books B, Authors A
WHERE
B.author = A.id
AND
B.reviewed = 1
'''
#I know the distinct isn't really necessary
check_query('Q2_2', A2_2, points=4)
# Your result will be printed below
# It should be a list with a single tuple containing a single integer
your result [(1168,)] Q2_2 correct
Q2_3. How many different countries read books?
# no need to write python, just write a single SQL statement.
A2_3 = '''
SELECT COUNT(DISTINCT L.country) FROM
Reads R, Locations L
WHERE
R.ip = L.ip
'''
check_query('Q2_3', A2_3, points=4)
# Your result will be printed below
# It should be a list with a single tuple containing a single integer
your result [(29,)] Q2_3 correct
Q2_4. What were the top three countries by number of books read? List the name of the country and the number of books they read sorted in descending order by count then ascending order by country name. Don't worry about countries possibly having the same number of books read.
# no need to write python, just write a single SQL statement.
A2_4 = '''
SELECT L.country, COUNT(B.slug)
FROM Locations L, Reads R, Books B
WHERE
L.ip = R.ip
AND
R.slug = B.slug
GROUP BY L.country
ORDER BY COUNT(B.slug) DESC, L.country ASC
LIMIT 3
'''
check_query('Q2_4', A2_4, points=6)
# Your result will be printed below
# It should be a list of tuples containing a string and an integer
your result [(u'United States', 6482), (u'Canada', 410), (u'Australia', 210)] Q2_4 correct
Q2_5. How many books were read per hour for each hour in the day? List the hour (0-23) and the number of books read in order by hour.
# no need to write python, just write a single SQL statement.
A2_5 = '''
SELECT R.hour, COUNT(B.slug)
FROM Reads R, Books B
WHERE
R.slug = B.slug
GROUP BY R.hour
ORDER BY R.hour
'''
check_query('Q2_5', A2_5, points=4)
# Your result will be printed below
# It should be a list of tuples of 2 integers
your result [(0, 40), (1, 22), (2, 28), (3, 19), (4, 65), (5, 14), (6, 51), (7, 73), (8, 374), (9, 666), (10, 798), (11, 1127), (12, 681), (13, 936), (14, 865), (15, 586), (16, 302), (17, 134), (18, 114), (19, 112), (20, 182), (21, 96), (22, 134), (23, 49)] Q2_5 correct
Q2_6. What was the average number of pages of all the books read in each country that read at least 100 books? List the country name and the average number of pages in the books they read. I'm asking for all books they read, not different books.
# no need to write python, just write a single SQL statement.
A2_6 = '''
SELECT L.country, AVG(B.pagecount)
FROM Locations L, Books B, Reads R
WHERE
L.ip = R.ip
AND
R.slug = B.slug
GROUP BY L.country
HAVING COUNT(B.slug)>=100
ORDER BY AVG(B.pagecount) DESC
'''
check_query('Q2_6', A2_6, points=6)
# Your result will be printed below
# It should be a list of tuples with a string and a float
your result [(u'Germany', 16.64814814814815), (u'United States', 12.995988892317186), (u'Australia', 12.947619047619048), (u'New Zealand', 12.848214285714286), (u'Canada', 12.78780487804878)] Q2_6 correct
Q2_7. How many times were books read that have the three-letter sequence 'dog' anywhere in the title? List the title of the book and the number of times it was read. Sort by the title and then the number of times read in descending order.
# no need to write python, just write a single SQL statement.
A2_7 = '''
SELECT B.title, COUNT(*)
FROM Reads R, Books B
WHERE
R.slug = B.slug
AND
B.title LIKE '%dog%'
GROUP BY B.slug
ORDER BY B.title, COUNT(*) DESC
'''
check_query('Q2_7', A2_7, points=8)
# Your result will be printed below
# It should be a list of tuples with a string and an integer
your result [(u'A Service Dog Can Do Many Things', 2), (u'All Dogs', 1), (u'All Dogs Are Different', 3), (u'Bailey The Dog', 2), (u'Caring for a Dog', 2), (u'Cool Dogs We Like', 1), (u"Corey's Dog Book", 1), (u'Counting with Dogs', 1), (u'DOGS are sooooo cute :)', 1), (u'Different Dogs', 1), (u'Dog Party', 1), (u'Dog Therapy', 2), (u'Doggies', 1), (u'Dogs', 5), (u'Dogs', 2), (u'Dogs', 2), (u'Dogs', 1), (u'Dogs', 1), (u'Dogs Are Best', 3), (u"Dogs are Man's Best Friend", 2), (u"Dogs are a Man's Best Friend", 1), (u'Dogs by Emilia', 1), (u'Dogs! Dogs!', 1), (u'Duncan the Wonder Dog', 1), (u'Finding a Lost Dog', 1), (u'Fun Dogs', 1), (u'Funny Dogs', 1), (u'I Like Dogs', 2), (u'I Like Dogs', 1), (u'Lots of Dogs', 1), (u'My Service Dog', 1), (u'Pasteur and the Mad Dog', 1), (u'Pet Dogs', 1), (u'Playing Hide and Seek with Clifford the Big Red Dog', 3), (u'Seaman, the bravest dog on the journey!', 2), (u'Service Dogs', 2), (u'Seven Dogs and More!', 1), (u'Super Dog', 1), (u'The Worst Dog Ever', 1), (u'What Dogs Might Say If They Could Talk', 1), (u'What dogs can do', 2), (u'dogs', 1)] Q2_7 correct
Q2_8. How many ip addresses outside of the United States read at least five different books?
# no need to write python, just write a single SQL statement.
A2_8 = '''
SELECT COUNT(*) FROM (SELECT L.ip FROM
Locations L, Reads R, Books B
WHERE
L.ip = R.ip
AND
L.country != 'United States'
AND
R.slug = B.slug
GROUP BY L.ip
HAVING COUNT(DISTINCT B.slug)>=5)
'''
check_query('Q2_8', A2_8, points=8)
# Your result will be printed below
# It should be a list with a single tuple containing a single integer
your result [(46,)] Q2_8 correct
Q2_9. Which authors wrote books that were read in 10 or more countries. List the author name and the number of different countries in which their books were read. Sort by number of countries in descending order and then by author name in ascending order.
# no need to write python, just write a single SQL statement.
A2_9 = '''
SELECT A.name, COUNT(DISTINCT L.country)
FROM Reads R, Books B, Authors A, Locations L
WHERE
R.slug = B.slug
AND
A.id = B.author
AND
R.ip = L.ip
GROUP BY A.name
HAVING COUNT(DISTINCT L.country)>=10
ORDER BY COUNT(DISTINCT L.country) DESC, A.name ASC
'''
check_query('Q2_9', A2_9, points=8)
# Your result will be printed below
# It should be a list of tuples with a string and an integer
your result [(u'Alf Thiel', 16), (u'Adalynn Shanahan', 14), (u'Dr. Ernesto Cruickshank', 13)] Q2_9 correct
Now, to be really certain you've got it all right, you should:
I hope you have a great break!
report(Onyen, Pledge)
Q1_1 answered Q1_2 answered Q1_3 answered Q1_4 answered Q1_5 answered Q1_6 answered Q1_7 answered Q1_8 answered Q1_9 answered Q1_10 answered Q1_11 answered Q1_12 answered Report for jpuccio Pledged on my honor: Joseph Puccio 12 of 12 answered for up to 48 points 9 of 9 correct, 52 of 52 points
Click the button below to submit your assignment. Watch for a confirmation message that your notebook was successfully uploaded. You may submit as often as you wish, only the last submission will count.