My Summer Vacation: Programming at Indigo Bioautomation

This summer I had a fun opportunity to program in a space I’ve never really explored before: medical data analysis. The company was Indigo Bioautomation. They build cloud-based software for analyzing mass spectrometer data. Here’s a pretty good introduction if you’re curious:

Despite medical tech’s reputation for somewhat old-school and slow software approaches, Indigo definitely feels like a startup. They are using new technology: TokuMX, a heavily modified Ruby on Rails stack, Backbone.js, etc. They also have an agile in-house process pretty similar to other places I’ve worked.

One thing that was different though: automated testing. The Indigo guys are totally on board and have tons of automated tests for everything. It’s a different mindset: I definitely had to be reminded to write more tests several times during my tenure. There’s a lot of good, but also some bad – in particular, long test runtimes (like 5+ minutes) are a huge problem there despite a lot of careful design work to optimize. And of course bugs still slip through in various places…though they definitely tended to be more obscure and less of the “this feature doesn’t even sort of work” variety I’ve seen elsewhere. My views on automated testing have definitely become a lot more nuanced – and I’m constantly thinking of ways I want to revise what we talk about in the SE curriculum to reflect that.

Indigo really let me explore their system. I worked on a big command-line tool, on their Java-based rules engine, on MongoDB queries, on front-end JavaScript, and on lots of Ruby code for both webservers and services. The opportunity to get down-and-dirty with Ruby was particularly fun: I really think I see now how it differs from Python in actual practice.

Overall I learned plenty and I think Indigo is happy with what I produced for them. I’m missing my students and looking forward to being back at Rose in a few weeks though.

Good Open Source Projects for Programming/Refactoring Assignments

So CSSE375 is a Rose-Hulman course focused on refactoring (course notes here). As a result, I like to have the students make changes to existing open source projects. This gives them a little experience with codebases of a realistic size, and it makes the assignments feel a little more authentic.

I generally like Java projects that can be made to build automatically in Eclipse. Java is not by any means a deal breaker, but it is the language they know best, so it lets me focus on the concepts most easily. Eclipse is not really a requirement either, but the project has to build really easily because otherwise I end up debugging 45 different weirdo build problems.

Here are a few that have worked for me:

  • Argo UML – a really big project, which is great (although it makes submissions a pain). A UML editor with some very weird use of objects in places. Here’s the assignment I used – and here’s the Eclipse workspace source I used in case you don’t want to check it out yourself. The force-based aspect of the assignment is maybe too fussy an algorithm…if I had to do it again I might do something a little simpler.
  • Cleansheets – a great Java spreadsheet that is quite reasonably designed. Builds very easily. The assignment I used was this one, but I’ve since released the solution so I wouldn’t go reusing it on anything important. But I would use the project again on another assignment.
  • jFTP – this one’s codebase has a LOT of duplication in it…great if you want to practice some refactoring techniques. Here’s the assignment I used, and I modified the codebase slightly to make testing harder and make the project build easily in Eclipse.
  • BORG – another large codebase, in this case a calendaring program. Unfortunately, it’s annoying to build because students must install the Lombok library. Here’s an assignment I did using it anyway.

If you’re looking for a more straightforward refactoring assignment, here’s what I started my students with – Calender Parse Part 1 & Part 2.

Programming Assignments for Programming Language Paradigms

This winter I taught a course called Programming Language Paradigms. This is a language-oriented course where students learn several interesting programming languages and discuss the various features of these languages. The goal is to learn the various paradigms of programming through actually using the languages to solve the kinds of problems that they are well suited for (you can see the complete notes for the course, if you’re interested). One of the consequences of this was that I had to develop a lot of assignments in each of the languages. Of course, I didn’t develop them from scratch – I dug around and bummed what I could from courses I found online. But there’s not as many assignments out there as there used to be, partly because I think many college courses now use Moodle and so are automatically walled off. But I figured at the very least I could contribute a few.

Prolog Assignments

Prolog turned out to be a great language to start with – absolutely different than what they had seen before, and fun to solve the problems.

  1. Maze Problem – simple pathfinding, first assignment in Prolog. Main thing is that it’s designed to force you to encounter infinite Prolog loops early.
  2. Word Find – a program that generates all possible “word finds” for a list of words. Main thing is that it forces you to carefully manage unbound variables.
  3. Grade NLP – final Prolog project. In practice, it was a bit too open-ended – lots of folks didn’t end up doing anything too cool. I think next time I’ll do a more traditional sentence structure NLP, plus more hints for how to structure the code. I think if I were to do it again I’d follow this assignment more closely.

Erlang Assignments

Erlang was maybe the least well-loved of the assignments, but I think that may be more due to the complexity of managing multithreaded programming.

  1. List Problems – first functional-ish language, so I needed to give them some practice with iterators.
  2. Simple Communication – in-class activity where students practiced setting up multiple processes and communicating between them.
  3. Merge sort – a multi-process merge sort. Students don’t actually have to implement merging or sorting, but the communication is tricky. This one has some basic unit tests.
  4. Paxos – final project, implementing the Paxos algorithm in Erlang. Includes some serious unit testing. But this assignment was probably too hard – I think I might do the Raft algorithm next time, which is supposed to be simpler.

Elm Assignments

Elm was liked by folks who enjoyed making games, less liked by others.

  1. Circle following mouse – simple but gets the idea of signals across
  2. Line drawing – adds in the state signal
  3. Spacewar – more complex state, getting ready for the final project
  4. Scrolling Video game – the assignment itself seemed to work pretty well. Nobody asked to do something different than the scroller.

Other in-class activities

  1. Lua/C integration activity
  2. An in class thing about making monads I did with Haskell

I have solutions for many of these assignments as well (excluding the really open-ended ones) – as long as you’re a professor or something similar. Email me.

Catan GUI in Elm

I’m in India, but there’s some downtime and I’ve been using it to play with a toy project – a Settlers of Catan game built in the pure FRP web language Elm. Well, perhaps “game” is too strong a word for what I’ve done thus far – let’s say Settlers of Catan GUI.

Click the image to see the code/try it out!

I really liked using a board game to try out Elm. It’s a fun thing to try BUT the mechanics of the game itself force you to confront real problems. The examples on the very nice Elm website are cool, but once I tried to actually engage with a data model more complex than just a few coordinates, things got pretty crazy.

One thing I can say is that the mechanics of FRP really encourage you to come up with a clean GUI/model separation. I eventually came up with this idea of hoverable stuff – in which, in theory, the model communicates to the GUI all the board objects that can be manipulated, then the GUI itself does the highlighting, and then, when the user clicks, the selected object is communicated back to the model – which adjusts its internal state and then gets re-rendered. Not bad.

Some of the other stuff is not so fun. Some of it is just bookkeeping – variables are not mutable, so when the state needs to be updated you can’t just update it. Instead you need to return a fully new model object, based on the old model object but with the variable changed. But when your model actually is complex, this is very annoying – the value you need to change is one variable in a record in a dictionary in the overall model. So you need a function that takes a record and returns a new record, a function that takes a dictionary and returns a new dictionary, and a function that takes a model and returns a new model. Also, if you have dictionaries of various keys and values, these functions are not easily reusable – although maybe you could do something with type parameterization somehow.

Other stuff is more than an annoyance. Every function must be pure, so random numbers are hard. In Elm, you need a random “signal”. But unfortunately, signals don’t have values at initialization time – they only exist when your program is actually “running” (i.e. a very funky pure functional transformation of input signals to GUI signals). So if your board is arranged randomly – prepare for some serious sadness. I did figure out a way around it, but it was anything but pretty.

Anyways, I had some fun with Elm. I’m not sure if I am actually going to complete the Catan game – most of the rest of it seems unlikely to teach me new stuff and board games have a lot of complexity. But I did learn a lot, and I think I may ask students to build a (much smaller) project in Elm as part of an upcoming programming language course I’ll be teaching in the winter.

ICER 2014: How CS Undergraduates Make Course Choices

So the second main part of my dissertation research was accepted into ICER 2014! Here’s a pre-publication version if you’re interested.

Students in most CS curricula have to make a wide variety of educational decisions including what courses to take. Frequently, they must make these decisions based on a very limited knowledge of the content of the topics they are choosing between. In this paper, I describe a theory of CS undergraduate course choices, based on 37 qualitative interviews with students and student advisors, analyzed with grounded theory. Most students did not have specific educational goals in CS and, as long as their classes were enjoyable, tended to assume that any course required by the curriculum had useful content (even if they could not articulate why). Particularly enjoyable or frustrating courses caused them to make long term course/specialization decisions and use a more strategic goal-oriented approach.

My Summer Vacation: Programming at Groupon

After focusing on teaching and research for the last few years, I wanted to get back to the swing of things and write some real code. Lucky for me, my good friend Ben twisted a few arms and got me a summer job with his team at Groupon (lest you doubt Groupon’s hiring practices…yes there were some technical interviews in there too). I worked on the Breadcrumb Pro team. Our app looked like this:


It’s a restaurant point-of-sale application – the system restaurants use to enter everybody’s orders, track employee hours, generate reports, etc. This is part of Groupon’s big plans to create a vast and profitable empire in business software. On the technical side, we’re talking a 100% classic client-server app – iOS on the frontend, Python on the back.

Ok, I admit a sense of pride looking at my commit history. Good to see I can still code when the situation requires it! :)

I had a great time and was pretty productive. I was initially worried three months might not be enough ramp-up time for me to be of value to my team. But within a month I was coding fast and signing up for high priority stories just like everybody. It actually sort of helped that I was not particularly familiar with either iOS or Python…I took stories on both sides and that really helped me understand the overall architecture. Especially on the fast development cycles we were on, there wasn’t a heck of a lot of spec design. So oftentimes a single story would require a change all the way from the DB to the frontend – just understanding every layer and changing it was way easier than attempting to coordinate a hand-off.

I think I’ll definitely give my students even more practice with understanding big systems – maybe even toss in one or two multilingual codebases. At the very least I managed to get a few more war stories to tell them about. By the end of it, both Ben and I were wondering why summer internships at other companies weren’t standard practice for professional software engineers.

Of course, even though I had a great time at Groupon, I’m still very much looking forward to teaching again in a few weeks.

ICER 2013: Undergraduate Conceptions of the Field of Computer Science

I had a great time at this year’s ICER conference – met up with some old friends and talked with some new ones about all sorts of CS education stuff. I also published a paper entitled Undergraduate Conceptions of the Field of Computer Science:

Students come to CS from a variety of backgrounds and with a variety of preconceptions. Some initially select CS with a very vague idea of the field they are majoring in. In this paper, I describe CS undergraduates’ view of the field of Computer Science. The approach was qualitative and cognitive: I studied what students think CS is and how students reasoned about their courses and curriculum. Through the use of grounded theory in 37 qualitative interviews with students and student advisors, I extracted three different conceptions about CS found in undergraduate CS majors using Grounded Theory. Overall, students had reasonable views of CS at a high level but lacked specifics. Students had difficulty describing subfields of CS or anticipating the content of courses they selected.

You can download a copy if you’re curious, or see all the good stuff from ICER 2013 from the complete proceedings.

My students are awesome: CSSE 376 Board Games

So in my quality assurance course, I ask my students to form groups of 2-3 and build computerized versions of board games. I selected board games in particular because I knew board game logic was both understandable AND tricky enough that it could generally benefit from robust unit testing (which was what I was mainly asking them to practice for this project).

I was very happy with what my students produced. Every team worked well and reliably turned out code week after week – perhaps partly because they knew I was monitoring their GitHub commits :) . These games are really quite complex…if you look at these videos I think you’ll see tons of detailed (and tricky to implement) logic.

And…lest you think I was slacking on the testing…each of these games has greater than 75% code coverage…not at all bad when you consider the complexity of the GUIs which are mostly untested.

Anyways, check out their stuff!

CivilizationMovieFinal from Spencer Murphy on Vimeo.

Reflection on Summer 2012 GHP Classes

I’m a big believer in student feedback. Even though students often can’t articulate the source of problems, going through and reflecting is an essential part of figuring out what I want to try going forward. So even in an environment like GHP where I’m not asked to formally collect evaluations, I always do. Of course, it really helps that at GHP in particular students are relentlessly positive. Seriously my worst review this summer was “not my favorite class by any means, but still good”. I wish that’s what my worst university eval looked like.

BUT, looking at the feedback (here and here), one thing that I am struck by is how much more satisfied I am by student feedback that is about course content and not about me. This is something that I think the Wicked Teacher of the West said first…but I can’t find the blog post now. When a student says “It’s a really great course” that of course makes me a little happy. But when a student says “I thought it was really neat how you could prove problems are incomputable by reducing them to other incomputable problems” that makes me think I did my job. It’s always about the student’s relationship with the content, not the student’s relationship with you.

I see that a lot more in my theory of computation course than I do in my fractals class – that makes me suspect I’m doing a better job in ToC.

Theory of Computation was different this year, I think because I didn’t have the same core of super-strong students who were really loving the course. There’s definitely a class culture that develops, and I’m not yet attuned enough to think seriously about how I can help its development. The course was still good, and I think I was able to smooth out some rough edges for some students. Still challenging, however, are the two main proof techniques we do in class: the pumping lemma and incomputability proofs. If I teach this course again I’m gonna at least crack the pumping lemma.

Fractals was better this year. I found it less stressful, and I think the students learned more. Fractal dimension seemed like a big hit this time around, so we can explore that a little more. We kicked things off with some very simple feedback functions and Chaos – I think that helped people get on board from the get-go. I think affine transformations need more exploration, discussion, and play. A little tricky without computers sometimes.

Changing the Emacs Modeline Color in a Buffer

I wanted to have the color of my term-mode mode line switch, depending on whether I was in character mode or line mode. I’m not sure why this became an overwhelming desire of mine, but it did. Not really knowing anything about Emacs font faces, themes, or anything else, I pulled up my debugger and started spelunking in color-theme-buffer-local. I discovered a mysterious function, face-remap-add-relative, designed to do exactly what I wanted – let me remap a particular part of a single buffer’s theme and then undo that mapping when finished.

A smarter person might have checked the emacs manual and saved himself a lot of time.

Here’s how it works:

;; set the modeline background color and save a "cookie" so the change can be undone
(setq old-term-color (face-remap-add-relative 'mode-line :background "dark goldenrod"))

;; undo that change later
(face-remap-remove-relative old-term-color)
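
To tie those two calls to the character mode / line mode switch described above, one option is to run them whenever term.el's term-char-mode and term-line-mode commands are invoked. What follows is only a sketch of that idea, not the configuration the post was written from: the my-term-* names are hypothetical, and it assumes Emacs 24.4 or later for advice-add and defvar-local.

```elisp
;; A sketch under stated assumptions; the my-term-* names are made up here.
(require 'term)
(require 'face-remap)

(defvar-local my-term-mode-line-cookie nil
  "Cookie returned by `face-remap-add-relative' in this buffer, or nil.")

(defun my-term-set-mode-line-color (color)
  "Remap the `mode-line' background to COLOR in the current buffer only.
A nil COLOR just removes any earlier remapping, restoring the theme's color."
  ;; Undo the previous remapping first so remappings never stack up.
  (when my-term-mode-line-cookie
    (face-remap-remove-relative my-term-mode-line-cookie)
    (setq my-term-mode-line-cookie nil))
  (when color
    (setq my-term-mode-line-cookie
          (face-remap-add-relative 'mode-line :background color)))
  (force-mode-line-update))

(defun my-term-mark-char-mode (&rest _)
  "Color the mode line while the terminal is in character mode."
  (my-term-set-mode-line-color "dark goldenrod"))

(defun my-term-mark-line-mode (&rest _)
  "Restore the default mode line color in line mode."
  (my-term-set-mode-line-color nil))

;; Run the helpers after each of term.el's mode-switching commands.
(advice-add 'term-char-mode :after #'my-term-mark-char-mode)
(advice-add 'term-line-mode :after #'my-term-mark-line-mode)
```

Keeping the cookie in a buffer-local variable means each terminal buffer tracks its own remapping. Note that the mode-line face only styles the selected window; non-selected windows use mode-line-inactive, so that face would need its own remapping if the color should stay visible there too.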

Interestingly, I’m pretty sure this ability to change colors for a specific buffer is one of the new features of Emacs 24. Yay progress.