Interactive SQL in R | Jason Bryer

Interactive SQL in R

I recently taught a very basic introduction to SQL workshop and needed a way to have participants interact with SQL statements. Obviously there are lots of tools to interface with a database, but since we are all R users I thought it would be nice to be able interact without leaving R. Although this interface is fairly basic, the fact that we can type in a SQL statement and get the results as an R data frame provides all the advantages of having data in R. Moreover, I found this to be an interesting exercise in see the power of R as programming language, not just as statistical software. The function described here is part of the sqlutils package which was created to manage a library of SQL files. More information about that is provided on the project page and I will likely have a forthcoming blog post too.

First we need to create a database to interact with. In this example we will use the students data frame from the retention package. We will save this data frame into a SQLite database using the RSQLite package. The R code to setup the database is provided as a demo in the package. Type demo('isql') to start.

require(sqlutils)
require(RSQLite)
require(retention)
data(students)
students$CreatedDate = as.character(students$CreatedDate)
m <- dbDriver("SQLite")
tmpfile <- tempfile('students.db', fileext='.db')
conn <- dbConnect(m, dbname=tmpfile)
dbWriteTable(conn, "students", students[!is.na(students$CreatedDate),])

We begin an interactive SQL environment with the isql function. The only required parameter is conn which is the connection to the database that SQL statements will be executed. The sql parameter is optional and sets the initial SQL statement for the session that can be edited or executed.

> hist <- isql(conn=conn, sql=getSQL('StudentSummary'))
Interactive SQL mode (type quit to exit, help for available commands)...
SQL>
help
   Command      Description
   ___________  ______________________________________________________
   quit         quit interactive mode
   help         display this message
   sql          enter SQL statement
   edit         edit SQL in a separate text window
   print        print the last entered SQL statement
   exec         execute that last entered SQL statement
   result       prints the last results
   save [name]  save the last executed query to the global environment
SLQ>
print
SELECT CreatedDate, count(StudentId) AS count FROM students GROUP BY CreatedDate ORDER BY CreatedDate
SLQ>
edit

SQL Edit Window

SLQ>
print
SELECT CreatedDate, count(StudentId) AS count FROM students GROUP BY CreatedDate ORDER BY CreatedDate
SLQ>
exec
Executing SQL...
118 rows of 2 variables returned
SLQ>
save
Data frame sql.results saved to global environment
SLQ>
quit

The isql function returns the history of the session invisibly (that is the results will not be printed but can be assigned to a variable). There are two elements in the returned list, commands is a character vector listing all the commands entered and sql is a character vector containing all the SQL statements entered.

> names(hist)
[1] "sql"      "commands"

Related

comments powered by Disqus