Trevis Rothwell's home page

Relational Database Programming Class at MIT

I started reading Philip Greenspun’s database-backed web application books and articles around 2005. I had previously seen database development as the dull, boring corner of computer science, but once I realized how useful it could be to connect a database to a web site, I became intrigued.

I shortly dove in and tried building some database-backed web sites of my own, more or less following along with Philip’s curriculum for his semester-long class at MIT. I was pleased with what I had learned, but was aware that my self-taught understanding of the database side of things was somewhat superficial.

In January 2012, I had the opportunity to take a three-day (all-day) intensive class at MIT, led by Philip, specifically focusing on relational database systems.

Class Format

Perhaps one of the most novel aspects of the class was the overall format. From what I’ve seen, most computer science classes consist of an hour-long lecture in which students are more or less read passages from a textbook, and then given homework assignments to go off and do by themselves.

In this class, Philip gave a mini-lecture of a half-hour or so, then gave the class a short programming exercise based on the lecture material. We were all to bring our own laptop computers to class and complete the assigned exercises right there in the classroom; thanks to the power of VirtualBox, we were all running the exact same software configuration. Philip and a handful of teaching assistants were available to help, and roamed about amongst the students asking us how we were doing if we appeared to be stuck.

After a half-hour or so of letting us churn through the exercise, we pasted our (attempts at?) solutions into a shared Google Docs document, and Philip reviewed and discussed our solutions in front of the class, followed by showing us his own solutions. Then he moved on to the next topic, and the process started over. The pace was fast, and reminded me of the time I spent on the Cornell College block schedule, only more so.

How did this approach work? I think very well. Nearly all of the material we covered in the class was taken either directly or indirectly from Philip’s online SQL book, which I had perused many times while working on my own database projects. But even though I had access to this knowledge, my depth and breadth of study was limited to what I thought I needed to know in order to solve my immediate problem. Being “forced” to work through all of the material in class, even if it appeared too boring or too challenging from a distance, prompted me to learn things that I had never bothered to learn before.

Web Clients vs. Native Applications

While the class focused on databases, we also took some brief excursions into making end user-facing applications that connect with the databases, both a small web application and a small Android mobile phone application.

What did we learn? It’s much easier to make web applications, and much more cumbersome to make native applications, that do more or less the same thing. Philip wisely pointed out that the web applications he built nineteen years ago still work fine, while he doubts that any mobile phone applications built today will still be usable nineteen years from now.

Which makes more sense to work on?

But Are Relational Database Systems Still Useful?

On Tuesday afternoon, database expert Michael Stonebraker gave a fascinating guest lecture in which he proposed that the standard big-name database systems like Oracle and MySQL might not really be what you want to use.

Modern database usage can be divided into three general areas: online transaction processing, data warehousing, and a variety of other specialized niche uses. Online transaction processing demands high availability and fast performance on short, simple queries. Data warehousing demands huge amounts of storage space and the ability to run complex queries. Quantitative analysis (such as that done on stock market numbers) demands access to data more like a two-dimensional array than rows of columns in a traditional database.
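To make the difference between the first two workloads concrete, here is a minimal sketch. It was not part of the class or the guest lecture: it uses Python's built-in sqlite3 module purely for illustration (not any of the systems named in this post), and the orders table and its contents are made up. The first function is transaction-processing style, touching a single row by its key; the second is warehouse style, scanning and aggregating the whole table.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a small, made-up orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        " order_id INTEGER PRIMARY KEY,"
        " customer TEXT NOT NULL,"
        " region TEXT NOT NULL,"
        " amount REAL NOT NULL)"
    )
    rows = [
        (1, "alice", "east", 20.0),
        (2, "bob", "west", 35.0),
        (3, "alice", "east", 15.0),
        (4, "carol", "west", 50.0),
    ]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def oltp_update(conn: sqlite3.Connection, order_id: int, new_amount: float) -> float:
    """Transaction-processing style: change one row by primary key, then read it back."""
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "UPDATE orders SET amount = ? WHERE order_id = ?",
            (new_amount, order_id),
        )
    row = conn.execute(
        "SELECT amount FROM orders WHERE order_id = ?", (order_id,)
    ).fetchone()
    return row[0]


def warehouse_report(conn: sqlite3.Connection) -> list:
    """Warehouse style: scan every row and aggregate by region."""
    return conn.execute(
        "SELECT region, COUNT(*), SUM(amount) "
        "FROM orders GROUP BY region ORDER BY region"
    ).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(oltp_update(db, 2, 40.0))
    print(warehouse_report(db))
```

On four rows the two queries cost about the same. The distinction only starts to matter when the table holds millions of rows and the two kinds of query compete for the same engine.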

Systems like Oracle and MySQL can be used to model the data in all of these situations, but they aren’t optimized for any of them. Instead, if your application needs more than Oracle can muster, it is better to use specialized database systems: VoltDB is optimized for online transaction processing, and runs as much as 75x faster than Oracle. Vertica provides a 25x performance boost over Oracle for data warehousing applications. For array-based processing, SciDB offers superior performance.

So is Oracle just a worthless piece of junk? Not exactly. There are plenty of applications that do not require a high-performance database. If you don’t need high performance, then Oracle might be a fine choice. An even better choice would be an open-source database system like Ingres or PostgreSQL, since they offer features comparable to Oracle without the hefty licensing fees.

In addition to recommending that we explore diverse kinds of database systems rather than assuming that one size fits all, Dr. Stonebraker admonished us to become well-versed in linear algebra and statistics, as he believed these skills will be increasingly valuable to database developers.

Additional Topics

Ollie the Border Collie

After the final class session, several of us stuck around to chat with Philip and hear his views on sundry other topics, such as startup companies.

Many young programmers today aspire to launch a successful startup company. Of course, many more startups fail in obscurity than strike it rich, but the allure is strong. The recent trend has been to try to acquire some sort of funding and go to work on your startup full-time.

Philip asked: why would you want to work at a poorly-funded startup, potentially working very long hours and always wondering when the money is going to run out? Wouldn’t it be better to get a job at a well-funded company and work on your product idea in your free time? If your project takes off, then wonderful, but if not, you haven’t put yourself at financial risk.

What about deciding which technologies to learn? All other things being equal, it may be better to develop experience on expensive proprietary technology than on freely-available technology: if a company has the money to pay for the proprietary technology, then (ostensibly) they also have the money to pay you well. Many proprietary software vendors offer free demos; download the demo, build something with the software, and add your newly-acquired skill to your resume.

But even more important than having a laundry list of technologies on your resume is being able to list projects completed. Being able to point to a software product (or a distinct sub-component of one) and say “I built that” is a strong indicator of engineering ability.

Outside of the Class

It seems that different parts of the world are somehow more conducive to certain fields of endeavor than other parts of the world. If you’re into science and engineering, the Massachusetts Institute of Technology is simply an awesome environment to be in. As I roamed the corridors, hung out in the student center, and bought tea in the lobby café, it was as if innovative technological ideas were just floating around in the atmosphere, available to grab and make your own.

Bonus Points: An MIT Hack

MIT students are well-known for performing hacks on campus, and we had the opportunity to witness a small one during the class:

[Photos: IMG_7228, IMG_7313]

Final Thoughts

The knowledge I gained in the class was sitting right before me in book form for years. Probably had I only read all of the material, and made a solid effort to work through all of the examples, I would have been much further ahead earlier on. This surely holds true for other areas as well: if a good educator has taken the time to write a good book on a topic that you wish to learn more about, then read the book.

That said, the class was a great experience. The interactive nature lets you learn more easily than through reading alone, and the class structure helps you to stay focused and get through the material.

Not only did I learn more details about database systems, but I feel professionally recharged, and eager to continue on in my career as a software engineer.

Filed under: Education, Programming