Stanford - CS145 - Fall 2012 - The Stanford University InfoLab

Wednesday, October 10, 2012

Help Session 2 

The SQL 



Poll 

- Who has used SQL before? 

- Who has never used SQL before? 

- And lastly... 

- Anyone secretly a backend ninja merely taking this class for the easy units? :)



Goal for Today 

- No new material, only new insights about how to construct queries and avoid common pitfalls



Agenda

- Quick syntax review

- Sample problems from “(extras)”

- Assignment #2 Challenge Problem

  · Debugging on the command line

- SQLite vs. MySQL vs. PostgreSQL



Digression: SQL Joke! 

- A database engineer walks into a bar and sees two tables.

- Says, “Can I join you?” ^_^ 

(Source: http://ask.sqlservercentral.com/questions/3898/which-is-the-best-sql-joke.html) 

- Moving on... 



Quick Syntax Review 


SQL Query Keywords 

Select From Where Group By Having 


Order By Distinct 

union intersect except 

exists is NULL is not NULL 

{=, <, >, <=, >=} any all

count() avg() sum() 

max() min() first() last() 


SQL Modification Keywords 

Create Table T1(A integer, ...); 

Drop Table T1; 

Delete From T1 Where [Condition]; 

Update T1 Set A = [Value/Subquery]; 


Insert Into T1 Values([Tuple]);

Insert Into T1 [Subquery];
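To make the modification keywords concrete, here is a minimal sketch that runs each statement against an in-memory SQLite database through Python's standard sqlite3 module. The tables T1 and T2, their columns, and the values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables and values, used only to exercise each statement.
conn.execute("Create Table T1(A integer, B text)")
conn.execute("Create Table T2(A integer, B text)")

# Insert with explicit tuples.
conn.execute("Insert Into T1 Values(1, 'x')")
conn.execute("Insert Into T1 Values(2, 'y')")

# Insert with a subquery: copy the rows of T1 where A > 1 into T2.
conn.execute("Insert Into T2 Select A, B From T1 Where A > 1")

# Update with a condition (without a Where clause, every row is updated).
conn.execute("Update T1 Set B = 'z' Where A = 1")

# Delete with a condition.
conn.execute("Delete From T1 Where A = 2")

t1_rows = conn.execute("Select A, B From T1 Order By A").fetchall()
t2_rows = conn.execute("Select A, B From T2 Order By A").fetchall()
print(t1_rows)  # [(1, 'z')]
print(t2_rows)  # [(2, 'y')]

# Drop removes the table itself, not just its rows.
conn.execute("Drop Table T2")
```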



Quick Syntax Review 

Done! 



Sample Problems 



Extra Movie-Rating Query Exercises 

Question 1 

- Find the names of all reviewers who rated Gone with the Wind.




Question 1 



Step 1: Look at the schema.

Movie(mID, title, year, director)

Reviewer(rID, name)

Rating(rID, mID, stars, ratingDate)




Question 1 



Step 2: Find your targets. 

(Diagram: the Movie, Rating, and Reviewer(rID, name) relations.)




Question 1 



Step 3: Trace a “join route”. 

Movie    Rating -Join- Reviewer(rID, name)




Question 1 



Step 3: Trace a “join route”. 

Movie -Join- Rating -Join- Reviewer(rID, name)




Question 1 



Step 4: Write that join out. 

Select * 

From Reviewer, Rating, Movie 

Where Reviewer.rID = Rating.rID 

and Rating.mID = Movie.mID; 




Question 1 



Step 5: Add the selection condition(s). 

Select *
From Reviewer, Rating, Movie
Where Reviewer.rID = Rating.rID
and Rating.mID = Movie.mID
and Movie.title = 'Gone with the Wind';




Question 1 



Step 6: Project only the columns you need. 

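Select Reviewer.name
From Reviewer, Rating, Movie
Where Reviewer.rID = Rating.rID
and Rating.mID = Movie.mID
and Movie.title = 'Gone with the Wind';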









Question 1 



Step 7: Resolve duplicates! 

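Select Distinct Reviewer.name
From Reviewer, Rating, Movie
Where Reviewer.rID = Rating.rID
and Rating.mID = Movie.mID
and Movie.title = 'Gone with the Wind';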
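To see why the Distinct in Step 7 matters, the sketch below runs the Question 1 query with and without it on an in-memory SQLite database. The schema follows Step 1. The sample rows are illustrative assumptions and not necessarily the course dataset. One reviewer rates the movie twice, so the plain projection repeats her name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
Create Table Movie(mID integer, title text, year integer, director text);
Create Table Reviewer(rID integer, name text);
Create Table Rating(rID integer, mID integer, stars integer, ratingDate text);
-- Illustrative rows (assumed for this sketch).
Insert Into Movie Values(101, 'Gone with the Wind', 1939, 'Victor Fleming');
Insert Into Movie Values(102, 'Star Wars', 1977, 'George Lucas');
Insert Into Reviewer Values(201, 'Sarah Martinez');
Insert Into Reviewer Values(204, 'Mike Anderson');
Insert Into Reviewer Values(205, 'Chris Jackson');
Insert Into Rating Values(201, 101, 2, '2011-01-22');
Insert Into Rating Values(201, 101, 4, '2011-01-27');
Insert Into Rating Values(204, 101, 3, '2011-01-09');
Insert Into Rating Values(205, 102, 4, '2011-01-12');
""")

# Same join and selection as Steps 4 and 5; only the projection changes.
query = """
Select {cols}
From Reviewer, Rating, Movie
Where Reviewer.rID = Rating.rID
  and Rating.mID = Movie.mID
  and Movie.title = 'Gone with the Wind'
"""

with_dupes = sorted(r[0] for r in conn.execute(query.format(cols="Reviewer.name")))
distinct = sorted(r[0] for r in conn.execute(query.format(cols="Distinct Reviewer.name")))

print(with_dupes)  # ['Mike Anderson', 'Sarah Martinez', 'Sarah Martinez']
print(distinct)    # ['Mike Anderson', 'Sarah Martinez']
```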








Will it always be that involved? 

- At first, probably... but you’ll get better

- Have a process, and you’ll rarely go wrong!

- At least, you’ll rarely get stuck (just accept now that your first attempt might not always work)




Question 4 

- Find the titles of all movies not reviewed by Chris Jackson.
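The slides leave Question 4 open, so the following is one possible approach, not necessarily the intended solution. First find the mIDs that Chris Jackson did rate (a join of Rating and Reviewer), then keep every movie whose mID is not in that set. The sample rows are assumptions for illustration. Note that a movie nobody has rated also counts as "not reviewed by Chris Jackson".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
Create Table Movie(mID integer, title text, year integer, director text);
Create Table Reviewer(rID integer, name text);
Create Table Rating(rID integer, mID integer, stars integer, ratingDate text);
-- Illustrative rows (assumed for this sketch).
Insert Into Movie Values(101, 'Gone with the Wind', 1939, 'Victor Fleming');
Insert Into Movie Values(102, 'Star Wars', 1977, 'George Lucas');
Insert Into Movie Values(103, 'The Sound of Music', 1965, 'Robert Wise');
Insert Into Reviewer Values(201, 'Sarah Martinez');
Insert Into Reviewer Values(205, 'Chris Jackson');
Insert Into Rating Values(201, 101, 4, '2011-01-27');
Insert Into Rating Values(205, 102, 4, '2011-01-12');
""")

# "Not reviewed by X" is a negation, so a plain join is not enough:
# subtract the set of movies X rated from the set of all movies.
titles = sorted(r[0] for r in conn.execute("""
Select title
From Movie
Where mID not in (
  Select Rating.mID
  From Rating, Reviewer
  Where Rating.rID = Reviewer.rID
    and Reviewer.name = 'Chris Jackson'
)
"""))

print(titles)  # ['Gone with the Wind', 'The Sound of Music']
```

A tempting but wrong alternative is to join all three tables and filter on Reviewer.name <> 'Chris Jackson'. That returns movies reviewed by someone else, which is a different question.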



Sample Problems 

Done! 



A2 Challenge Problem 




- A child goes down to breakfast one morning and tells his parents, “I prefer Count Star to Count Dracula...

- ...because I don’t like having to consider which Draculas are NULL.” ^_^





- A child goes down to breakfast one morning and tells his parents, “I prefer count(*) to count(Dracula)...

- ...because I don’t like having to consider which Draculas are NULL.” ^_^
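The joke rests on a real difference: count(*) counts rows, while count(column) skips rows where that column is NULL. Here is a small sketch of that behavior in SQLite. The table name Castle and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical one-column table; the middle row holds a NULL.
conn.execute("Create Table Castle(Dracula text)")
conn.executemany(
    "Insert Into Castle Values(?)",
    [("Vlad",), (None,), ("Alucard",)],
)

count_star, count_dracula = conn.execute(
    "Select count(*), count(Dracula) From Castle"
).fetchone()

print(count_star)     # 3: every row is counted
print(count_dracula)  # 2: the NULL Dracula is skipped
```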


- Moving on... 



Create a Test Relation 

- Install SQLite3 (http://mislav.uniqpath.com/rails/install-sqlite3/)

- Running Mac OS X? You already have it!

- Run sqlite3 from the command line

- >> Create Table Edge(n1 integer, n2 integer);




Don’t forget the semicolon! 



Insert Some Data 

(Figure: a directed graph on nodes 1, 2, 3, and 4.)

>> Insert into Edge values(1,2);



Let’s Try Something 

- Write a SQL query to find the average out-degree of nodes in the graph.






Step 1: Create a relation mapping node IDs to their out-degrees.






>> Select *
From Edge
Group By n1;

Bad News Bears! In SQLite, this returns for each n1 a single, arbitrary tuple having that n1 value.






>> Select n1, count(*)
From Edge
Group By n1;

Fix: Use an aggregate function to map each n1 to a meaningful summative value.
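Here is a sketch of the grouped count in action. The graph figure from the slides is not recoverable, so the edge list below is an assumption, chosen only so that nodes 1 through 4 have out-degrees 3, 2, 1, and 0 as in the final check. An Order By is added to keep the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("Create Table Edge(n1 integer, n2 integer)")

# Assumed edge list (hypothetical): gives out-degrees 3, 2, 1, 0.
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
conn.executemany("Insert Into Edge Values(?, ?)", edges)

# One row per source node, with the number of edges leaving it.
out_degrees = conn.execute("""
Select n1, count(*) as outDegree
From Edge
Group By n1
Order By n1
""").fetchall()

print(out_degrees)  # [(1, 3), (2, 2), (3, 1)]
```

Node 4 never appears as n1, so it gets no row at all. It does not show up with a count of 0.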






- Now, how do we aggregate data from this new relation?

Bring on the FROM clause subquery!






>> Select *
From (
  Select n1,
    count(*) as outDegree
  From Edge
  Group By n1
) R;

We need to give this a name (here, R) so we can refer to it outside of the subquery.




- Selecting the avg(outDegree) might sound like the next step, but that’s actually incorrect!

- It would fail to consider nodes that have zero outgoing edges.

- How do we account for those?
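The pitfall is easy to reproduce. In the sketch below, the edge list is an assumption (the slide's graph figure is not available), picked so that node 4 has no outgoing edges. Averaging over the grouped relation then divides by 3 nodes instead of 4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("Create Table Edge(n1 integer, n2 integer)")

# Assumed edge list (hypothetical): out-degrees 3, 2, 1, and 0 for node 4.
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
conn.executemany("Insert Into Edge Values(?, ?)", edges)

# Naive attempt: average the out-degrees in the grouped relation R.
naive_avg = conn.execute("""
Select avg(R.outDegree)
From (
  Select n1, count(*) as outDegree
  From Edge
  Group By n1
) R
""").fetchone()[0]

print(naive_avg)  # 2.0, i.e. (3 + 2 + 1) / 3; node 4 is silently left out
```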




- We need another subquery to count the number of unique node IDs in the table:

...
(
  Select count(*)
  From (
    Select n1
    From Edge
    union -- Automatically eliminates duplicates.
    Select n2
    From Edge
  )
)
...
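Run on its own, the node-counting subquery looks like the sketch below. The edge list is again an assumption made for illustration. Node 4 appears only as a target (n2), which is exactly why both columns need to be unioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("Create Table Edge(n1 integer, n2 integer)")

# Assumed edge list (hypothetical); node 4 only ever appears as n2.
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
conn.executemany("Insert Into Edge Values(?, ?)", edges)

# union (unlike union all) removes duplicates, so each node ID is counted once.
node_count = conn.execute("""
Select count(*)
From (
  Select n1 From Edge
  union
  Select n2 From Edge
)
""").fetchone()[0]

print(node_count)  # 4
```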




- We then use our two subquery results to compute the average:

Select
  (sum(R.outDegree) + 0.0) /
  (
    Select count(*)
    From (
      Select n1
      From Edge
      union
      Select n2
      From Edge
    )
  )...

Adding 0.0 is a hacky way to cast the sum as a float.



The Final Query... 

Select
  (sum(R.outDegree) + 0.0) /
  (
    Select count(*)
    From (
      Select n1
      From Edge
      union
      Select n2
      From Edge
    )
  )
From
  (
    Select n1, count(*) as outDegree
    From Edge
    Group By n1
  ) R;






Last Step: CHECK YOUR ANSWER! 

- Node 1: 3 out-edges +
  Node 2: 2 out-edges +
  Node 3: 1 out-edge +
  Node 4: 0 out-edges = 6 total / 4 nodes = 1.5
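The hand check can also be automated. This sketch runs the final query, with the inner count(*) and union written out, on an in-memory SQLite database and compares the result with 1.5. The edge list is an assumption, since the original graph figure is not available. It was chosen to give the out-degrees 3, 2, 1, and 0 listed in the check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("Create Table Edge(n1 integer, n2 integer)")

# Assumed edge list (hypothetical): out-degrees 3, 2, 1, 0 for nodes 1-4.
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
conn.executemany("Insert Into Edge Values(?, ?)", edges)

# Total out-degree divided by the number of distinct nodes.
avg_out_degree = conn.execute("""
Select
  (sum(R.outDegree) + 0.0) /
  (
    Select count(*)
    From (
      Select n1 From Edge
      union
      Select n2 From Edge
    )
  )
From
  (
    Select n1, count(*) as outDegree
    From Edge
    Group By n1
  ) R
""").fetchone()[0]

print(avg_out_degree)  # 1.5, i.e. 6 edges / 4 nodes
```

Without the + 0.0, SQLite would divide two integers and return 1, which is why the float cast is needed.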



A2 Challenge Problem 

Done! 



MySQL vs. SQLite vs. PostgreSQL



(Venn diagram: keyword support across MySQL, SQLite, and PostgreSQL. Keywords shown: in, all, union, exists, any, except, intersect.)
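As one concrete data point for the comparison, the sketch below checks that SQLite accepts the three set operators: union, intersect, and except. It says nothing about the other two systems, whose support varies by version. The Edge rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("Create Table Edge(n1 integer, n2 integer)")

# Assumed edge list (hypothetical).
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
conn.executemany("Insert Into Edge Values(?, ?)", edges)


def run_set_op(op):
    """Combine the n1 and n2 columns of Edge with the given set operator."""
    sql = f"Select n1 From Edge {op} Select n2 From Edge"
    return sorted(row[0] for row in conn.execute(sql))


all_nodes = run_set_op("union")       # nodes appearing anywhere
both_roles = run_set_op("intersect")  # nodes that are both a source and a target
sources_only = run_set_op("except")   # sources that are never a target

print(all_nodes)     # [1, 2, 3, 4]
print(both_roles)    # [2, 3]
print(sources_only)  # [1]
```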



MySQL vs. SQLite vs. PostgreSQL

Done! 



Agenda

- Quick syntax review

- Sample Problems

- Assignment #2 Challenge Problem

- SQLite vs. MySQL vs. PostgreSQL

Done! 



Questions? 



Thanks for coming!
