Karl R. Wurst

I am Professor of Computer Science at Worcester State University.

Dec 28, 2013
 

Now that we have the CS Department’s GitLab server set up, and CS-140 Lab 1 is rewritten and tested using the new server, I’ve started to think about how to automate my interactions with the server. I had already written some Bash scripts to interact with the Bitbucket server to get student code, convert it to PDF files, and put it back on the server after grading. Those scripts should still work fine with GitLab, since it’s just git on a different server.

One thing that I had not been able to automate previously is the step of issuing a pull request for students to merge my grading branch into their repository. This was not too much of an issue when there were only 6 students in the summer class (so only 3 repositories per lab assignment), but it was going to take more time with ~48 students in the spring class. While reading RSS feeds, I came across a post mentioning the GitLab API. This could be the solution to my problems! And there’s a Python module for the API! I had already been writing Python scripts to make my grading easier, and had been starting to rewrite my Bash scripts in Python.

I started playing with the GitLab API in Python, and managed to create a merge request (GitLab’s term for a pull request). I had also noticed that you could create GitLab accounts through the API. This seemed like something I should pursue – creating ~48 accounts per semester seemed like something that should be automated.

Since I intended to post my code on GitHub, one of the first issues I had to address was how to avoid publishing my private token for GitLab. I could have put in a dummy token before pushing my code, but I would have had to remember to do that every time I committed. The solution was the .gitignore file: if I put my token into a separate file, I could add a line to .gitignore so that the token file would never be committed.

# Private GitLab Token - not to be stored in repository #
########################################################
gitlabtoken.txt

Then I could just read the token out of the file, and use that string.

# Get my private GitLab token
# stored in a file so that I can .gitignore the file
with open('gitlabtoken.txt') as token_file:
    token = token_file.readline().strip()

After importing the pyapi-gitlab module, I could use that token, along with the server’s URL, to create a GitLab object. Notice that I had to turn SSL verification off, since we only have a self-signed certificate.

# Create a GitLab object
# For our server, verify_ssl has to be False, since we have a self-signed certificate
git = gitlab.Gitlab(GITLAB_URL, token, verify_ssl=False)

Creating a user account is pretty simple using the API:

# Create the account  
success = git.createuser(name, username, password, email)

The returned success value is a boolean — either it worked, or it failed (but you can’t tell why…).

One thing that’s a bit odd about the createuser call is that you have to set a password for the user, but the notification email to the user doesn’t include the password. (If you create a user account from the web interface, it generates a random password, includes it in the notification email to the user, and requires the user to change their password when first logging in.) And the password you set doesn’t seem to work either!

So, I’m just telling the students that they should use the “Forgot Password” link to have a password reset email sent to them, and then proceed from there. (If this is ever fixed, I’ll have to generate a random password.)
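If the password passed to createuser ever starts working, a random one could be generated along these lines. This is only a sketch: the random_password name and the default length of 16 are my own choices, not part of the script or the GitLab API, and it relies on Python's standard secrets module.

```python
import secrets
import string


def random_password(length=16):
    """Return a random password made of ASCII letters and digits.

    The function name and the default length are illustrative assumptions.
    """
    alphabet = string.ascii_letters + string.digits
    # secrets.choice draws from a cryptographically strong random source
    return ''.join(secrets.choice(alphabet) for _ in range(length))


print(random_password())
```

The result could then be passed as the password argument to createuser and included in a message to the student.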

Getting the class list as a CSV file from the Blackboard Grade Center is pretty easy, and the first three columns contain the student’s last name, first name, and username. I can use those three strings to generate the name, username, and email needed for the createuser API call.

The only challenge with processing the CSV file is that Blackboard puts some strange character at the beginning of the file, so the file has to be opened with utf-8 encoding. (And the header line needs to be thrown away.)
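Reading the class list might look something like the sketch below. It assumes the strange character is a UTF-8 byte order mark, which Python's 'utf-8-sig' codec strips on reading. The read_students name, the sample data, and the example.edu email domain are all hypothetical stand-ins.

```python
import csv
import io

EMAIL_DOMAIN = 'example.edu'  # hypothetical; substitute the real campus domain


def read_students(csvfile):
    """Return (name, username, email) tuples from a Blackboard-style CSV.

    Assumes the first three columns are last name, first name, username.
    """
    reader = csv.reader(csvfile)
    next(reader)  # throw away the header line
    students = []
    for row in reader:
        if len(row) < 3:
            continue  # skip blank or short lines
        last, first, username = row[:3]
        students.append((first + ' ' + last, username,
                         username + '@' + EMAIL_DOMAIN))
    return students


# Simulate a downloaded file that starts with a byte order mark
sample_bytes = '\ufeffLast Name,First Name,Username\nDoe,Jane,jdoe\n'.encode('utf-8')
sample = io.TextIOWrapper(io.BytesIO(sample_bytes),
                          encoding='utf-8-sig', newline='')
print(read_students(sample))
```

With a real download, the file would be opened with open(filename, encoding='utf-8-sig', newline='') and passed to read_students in the same way.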

The last thing I wanted to add is a way to have optional verbose output, so that I could see if the user creation was working. (I decided that it should always notify the user if the account creation failed.) To do this I had to learn two new things about Python: how to parse arguments [1], and how to send output to stderr.

I used the argparse module:

import argparse
# Set up to parse arguments
parser = argparse.ArgumentParser()
parser.add_argument('filename', help='Blackboard CSV filename with user information')
parser.add_argument('-v', '--verbose', help='increase output verbosity', action='store_true')
args = parser.parse_args()

and used the verbose argument to determine what to print:

if not success:
    sys.stderr.write('Failed to create account for: '+name+', '+username+', '+email+'\n')
elif args.verbose:
    sys.stderr.write('Created account for: '+name+', '+username+', '+email+'\n')
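The two pieces can be tried together without a GitLab server. In this sketch the report function and the sample student are made up for illustration, and the argument list is passed to parse_args explicitly so the example runs without a real command line.

```python
import argparse
import sys


def report(success, name, username, email, verbose, stream=sys.stderr):
    """Write the outcome of one account creation to stream (stderr by default)."""
    details = ', '.join([name, username, email])
    if not success:
        # Failures are always reported
        print('Failed to create account for: ' + details, file=stream)
    elif verbose:
        # Successes are reported only when -v/--verbose was given
        print('Created account for: ' + details, file=stream)


parser = argparse.ArgumentParser()
parser.add_argument('filename', help='Blackboard CSV filename with user information')
parser.add_argument('-v', '--verbose', help='increase output verbosity',
                    action='store_true')
# An explicit argument list stands in for the real command line here
args = parser.parse_args(['students.csv', '-v'])

report(True, 'Jane Doe', 'jdoe', 'jdoe@example.edu', args.verbose)
```

Taking the stream as a parameter makes the output easy to capture when checking the behavior.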

Full code is on GitHub here.

  1. I already knew how to do simple argument parsing, but I wanted to learn how to deal with optional arguments.
Dec 13, 2013
 

Downloading student assignment files from Blackboard as a single zip file saves a lot of time — you don’t have to individually open each “attempt”, download the file (renaming it in the process, so you don’t keep overwriting the previous file, since they are all named “Homework1.pdf” 😉 ), and then move on to the next one. Instead you get one convenient .zip file that contains all of the assignment files.

Unfortunately, Blackboard does some other things that make your life a bit more difficult. Once you unzip the file, you will find:

  1. The student files are renamed from filename.ext to assignmentname_username_attempt_datetime_filename.ext
  2. A text file is created for each student named assignmentname_username_attempt_datetime.txt even if the student has not entered any text data or comments.

Checking all of the text files to see if they really contain a comment and deleting those that don’t, and then renaming all of the assignment files to username.ext so that I can start grading them [1], takes 15 minutes or more per assignment, which certainly lowers my enthusiasm for grading.

Today, I decided that I should write some code to automate this task. The time it would take to write the script would be recouped in only a few assignments. I decided to write the script in Python because I could easily see how to do the string manipulations. My shell scripting string manipulations are not as good. I would have to learn how to do the file system manipulations in Python, but I figured that would be relatively simple.

The first step is getting a list of all the files in the directory (leaving out all of the subdirectories) [2]:

import os
onlyfiles = [f for f in os.listdir(os.curdir) if os.path.isfile(os.path.join(os.curdir, f))]

The next step is filtering that list to get just the .txt files:

txtfiles = [f for f in onlyfiles if f.endswith('.txt')]

Then you can search the contents of the textfiles. You’ll notice that there are two characteristic phrases that indicate no text data and no comments. You can just delete the files that contain both of those:

for f in txtfiles:
    # Read the whole file so both phrases can be checked
    with open(f) as txtfile:
        contents = txtfile.read()
    # Both phrases appear when the student entered no text and no comments
    if 'There are no student comments for this assignment' in contents and \
       'There is no student submission text data for this assignment.' in contents:
        os.remove(f)
        print('Deleted', f)

After refreshing the list of files to be just the remaining files, you can go about renaming the files. They all have _attempt_ embedded in their filename. Then you want to strip off everything up-to-and-including the first underscore, and from the second underscore up to the file extension. Then rename the file.

for f in onlyfiles:
    if '_attempt_' in f:
        first = f.find('_') # location of first underscore
        second = f.find('_',first+1) # location of second underscore
        extension = f[f.rfind('.'):] # get file extension
        newf = f[first+1:second] + extension
        os.rename(f, newf)
        print('Renamed', f, 'to', newf)
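The renaming rule can also be pulled out into a small function that is easy to try on sample names before touching any files. This is a sketch, not the script's actual code: the username_filename name and the sample filenames are made up, and it assumes usernames never contain an underscore. Unlike the loop above, it anchors on _attempt_, so it still works if the assignment name itself contains underscores.

```python
import os


def username_filename(blackboard_name):
    """Map assignmentname_username_attempt_datetime_filename.ext to username.ext.

    Assumes the username itself contains no underscore.
    """
    # Everything before '_attempt_' is assignmentname_username
    prefix, _, _ = blackboard_name.partition('_attempt_')
    # The username is whatever follows the last underscore in that prefix
    username = prefix.rsplit('_', 1)[-1]
    extension = os.path.splitext(blackboard_name)[1]
    return username + extension


# Hypothetical names in the pattern Blackboard produces
print(username_filename('Homework1_jsmith_attempt_2013-12-13-09-15-00_Homework1.pdf'))
print(username_filename('Lab_1_jsmith_attempt_2013-12-13-09-15-00.txt'))
```

The result can be passed to os.rename(f, username_filename(f)) for each file whose name contains _attempt_.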

There are probably other features I can add, but this works well enough for now. Back to grading…

Full code is on GitHub here.

  1. I may still have to convert some of them to PDFs, if the students have not followed instructions, since I grade them by marking up the PDFs on my iPad. But that’s something I’ll tackle later. For my programming classes, I do that with my grading scripts which are still a work-in-progress.
  2. http://stackoverflow.com/a/3207973
Dec 10, 2013
 

Chad Day recently completed the installation of our new GitLab server (read about it here and here). This project was precipitated by some issues I had been having in trying to teach the use of git earlier in our curriculum. I had been having the CS-401 Software Development Process students use git and GitHub in their FOSS projects, but it was difficult for them to see git for the first time and be expected to use it intensively in a project in the same semester. They had asked a number of times, “Why don’t you teach this in an earlier course?”

So, I decided to try using it in the first programming course – CS-140 Introduction to Programming. While they don’t do any large projects in CS-140, they do work on their weekly labs as Pair Programming. Using git for the collaboration aspect (so they don’t have to keep emailing versions back-and-forth to each other) and as a way to submit their completed lab assignments to me (so I only have to receive one copy of the assignment per pair) seemed to make a lot of sense. In addition, I had attended a workshop at CCSCNE 2013 entitled “Git on the Cloud” which provided a methodology to do just that, which had encouraged me further.

The “Git on the Cloud” workshop suggested using Bitbucket, since it allows an unlimited number of private repositories. [1] I’m very willing to have students make their code for senior-level projects public with an open source license (in fact, I often require it). But private is important for coursework at the freshman level.

On the other hand, the “Git on the Cloud” methodology involved using a single repository per student, and a different branch for each assignment(!) In other words, whenever you change branches/assignments, all of your other code goes away, and is replaced with the code for the current assignment.

After discussing this with Chad and Dillon Murphy, we decided that this was too confusing, and gives students an incorrect idea of how git should be used. Also, it would only work in pairs if the students worked in the same pair throughout the semester, and I like to have my students switch partners for each lab. So, I wrote my lab instructions using Bitbucket, but one repository per assignment, per pair.

I tried it out during my summer 2013 section of CS-140. It was a nice testbed — with only 6 students it was not too difficult to work out the bugs in the procedures. In another post I’ll explain how I had the students use the repository, and how I processed the repositories for grading (including the scripts that I wrote with some help from Stoney Jackson on a train ride from Providence to Philadelphia.)

The problem came when I decided to try it with my CS-135 Programming for Non-CS Majors class in Fall 2013. Soon after we started the first lab (the git lab), the student who had also been in the summer class could not add her lab partner as a collaborator to her Bitbucket repository. After a bit of investigation, we determined that while Bitbucket allows unlimited private repositories, you can have at most 5 collaborators (per account, not per repository) without paying. That had not been a constraint with a class of 6, but it certainly was with a class of 24, and it would just get worse as the students progressed to other courses.

At this point, I gave up on git for the semester and started looking for alternatives. I had used Gitolite in the past, which had worked well but had no web interface. I wanted something more like GitHub, and came across GitLab. I added installing GitLab to Chad and Dillon’s project of building a number of new servers for the department (see here and here).

Once Chad had finished the GitLab install and worked out the kinks, we decided to test it by running through the CS-140 Lab 1 using the new server. I quickly updated the lab assignment to refer to a repository on our GitLab server, set up the repository on the new server, and had Dillon and Chad work their way through the lab to look for problems. They found a few typos, and a number of places where I had not replaced all the references to Bitbucket with GitLab, but otherwise it worked.

We had only one puzzling issue: Dillon was able to push changes to a repository he should not have had sufficient privileges to modify. It turns out that (not surprisingly) if you are a GitLab administrator, you have the ability to push to any repository on the server.

I’m looking forward to testing the server on a larger scale with 48 students in CS-140 starting in January.

  1. I’m aware that students can get multiple private repositories from GitHub, but that requires contacting GitHub and asking. The Bitbucket option just seemed easier.
Dec 02, 2013
 

In my CS-135 Programming for Non-CS Majors class, one of the primary objectives for the students is to learn to work with collections of data in files. I’m always happy when this requires manipulations that can’t be performed with other tools that the students are comfortable with — thus motivating the need to learn to code.

This afternoon in class, students were working in groups on their final projects. Two groups came up against some problems in getting their data into a format that could be easily processed in Python. Both cases involved data that was only available in the form of PDF files.

The old standby of selecting text and pasting it into Excel did not provide nice columns of information. Our second attempt was to export the data as text.

Case 1

In the first case, we got text data that looked like:

Biology 306 N/A 306
Biotechnology 80 26 106
Business Administration 748 N/A 748
Chemistry 141 N/A 141
Communication 245 N/A 245
Communication Sciences & Disorders 218 N/A 218
Community Health 158 N/A 158
Computer Science 116 N/A 116
Criminal Justice 445 N/A 445
Early Childhood Education 80 19 99
Early Childhood Education, Non-Licensure 26 N/A 26

This looked promising – we’ve dealt with one-record-per-line, space-delimited data files in class before. You just need to read a line at a time, and use Python’s string split method to turn it into a list… But wait! The first item is a variable number of words separated by spaces. That will make for some messy lists, all of different lengths:

['Communication', '245', 'N/A', '245']
['Communication', 'Sciences', '&', 'Disorders', '218', 'N/A', '218']
['Community', 'Health', '158', 'N/A', '158']

Here’s the solution: Python lists can be indexed from the end using negative indices. So, we can definitely get at the last three values (numbers of majors — undergraduate, graduate, and total). Assuming a list in a variable department, they are at positions department[-3], department[-2], and department[-1] respectively.

But, what about the department name, which may be in multiple list items? Well, we can get it as a sub-list, using list slicing: department[:-3] yields:

['Communication']
['Communication', 'Sciences', '&', 'Disorders']
['Community', 'Health']

All that’s left is to concatenate them together into a single string:

name = ''
for item in department[:-3]:
    name = name + item + ' '
name = name.strip()  # drop the trailing space left by the loop
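Putting the negative indices and the slice together, one line of the file could be handled as in the sketch below. The parse_department_line name is just for illustration, and ' '.join is used in place of the loop; it builds the same name without leaving a trailing space.

```python
def parse_department_line(line):
    """Split one line into (name, undergraduate, graduate, total).

    The last three fields are the counts; everything before them is the name.
    """
    fields = line.split()
    name = ' '.join(fields[:-3])  # rejoin the words of the department name
    undergraduate, graduate, total = fields[-3:]
    return name, undergraduate, graduate, total


print(parse_department_line('Communication Sciences & Disorders 218 N/A 218'))
```

The counts stay as strings here because some of them are 'N/A'. Converting to int would need a check for that value first.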

Full code is here: https://gist.github.com/kwurst/7761789

Case 2

In the second case, we got text data that looked like:

Boston    00350000    4368    65.9    15.2    0.8    2.1    15.9    0.1
Boston Collegiate Charter (District)    04490000    34    67.6    32.4    0.0    0.0    0.0    0.0
Boston Day and Evening Academy Charter (District)    04240000    162    13.0    55.6    0.0    6.8    24.7    0.0
Boston Green Academy
Horace Mann Charter School
(District)    04110000    72    70.8    26.4    0.0    1.4    1.4    0.0
Boston Preparatory Charter Public (District)    04160000    27    74.1    11.1    0.0    3.7    11.1    0.0
Bourne    00360000    145    90.3    4.8    0.0    2.1    2.8    0.0
Braintree    00400000    369    95.1    3.3    0.3    0.3    1.1    0.0

This could be handled the same way as Case 1, except for the fact that some of the district names ended up broken across multiple lines. (I’m not sure why this happened, and it turned out that exporting the data in a different way fixed the problem. But I’d already found a solution, so I’m going to document it here…)

Working from the assumption that the district org code always starts with a zero (I know — not a good assumption, but it works in this case…), the solution involves checking for lines with no zero in them and concatenating them together. Then you can treat the lines as in Case 1.

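for line in f:
    while '0' not in line:  # name is broken across lines; keep joining
        rest = f.readline()
        if not rest:  # end of file, so stop instead of looping forever
            break
        line = line + rest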
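The same idea can be written as a generator that works on any sequence of lines, which makes it easy to try on a few sample lines without a file. This is a sketch under the same assumption as above (every complete record contains a zero). The merge_broken_lines name is made up, and the sample lines are shortened from the data shown earlier.

```python
def merge_broken_lines(lines):
    """Yield complete records, joining names that were split across lines.

    Assumes a record is complete once a line containing '0' has been seen.
    """
    parts = []
    for line in lines:
        parts.append(line.strip())
        if '0' in line:
            # The org code has appeared, so the record is complete
            yield ' '.join(parts)
            parts = []
    if parts:
        # Leftover text at the end of the input with no org code
        yield ' '.join(parts)


sample = [
    'Boston Green Academy\n',
    'Horace Mann Charter School\n',
    '(District)    04110000    72    70.8\n',
    'Bourne    00360000    145    90.3\n',
]
for record in merge_broken_lines(sample):
    print(record)
```

Each merged record can then be split and indexed from the end, as in Case 1.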

Full code is here: https://gist.github.com/kwurst/7761789

Oct 31, 2013
 

Some of you will remember the post My Year of Open Source from 1 January 2011 – almost 3 years ago – where I made a New Year’s resolution to participate more in FOSS. Here are the goals I listed for myself for that year:

I have four main goals (at this point):

  1. Learn the tools and processes myself by participating in a FOSS project.
  2. Figure out what FOSS tools and processes I can begin to introduce my students to in earlier courses.
  3. Figure out what FOSS experience(s) I can provide my non-CS students.
  4. Find a project (or projects) to place my Senior CS students into in Spring 2012.

Well, it was as successful as most New Year’s resolutions – meaning, not very. Or maybe, not completely. I was (partially) successful at some of those goals, although almost none were completed within the year that I so rashly promised.

Figure out what FOSS tools and processes I can begin to introduce my students to in earlier courses.

This one was somewhat successful, although not until this past June (2013) when I managed to have my summer Introduction to Programming class (all six students!) use git and Bitbucket to collaborate with their lab partners and to submit their work to me for grading. Fresh from that (small-scale) success, I tried to have my Programming for Non-CS Majors class do the same, and ran into some scaling issues. We’re working on the solution for that right now – more in a future post.

My Spring 2013 capstone project course did use git and GitHub for our project developing an app for a Worcester Art Museum exhibit. But my understanding of git was not as good as it could have been, and the students’ use of git was spotty. We also planned to use Pivotal Tracker, but didn’t get very far. We did successfully use IRC, however.

Find a project (or projects) to place my Senior CS students into in Spring 2012.

My Spring 2012 capstone project course worked with Eucalyptus, and had some pretty strong interaction with some of the members of the community, but I think that both the students and I felt we weren’t as successful as we could have been due to some technical issues early on in the course. For Spring 2013, I abandoned working in an existing FOSS project in favor of new development when the Worcester Art Museum opportunity presented itself. We did, however, make our code freely available (https://github.com/CS-Worcester/JILOA)

Figure out what FOSS experience(s) I can provide my non-CS students.

This goal got very little attention, other than my abortive attempt at using git in the Programming for Non-CS Majors course.

Learn the tools and processes myself by participating in a FOSS project.

And I still have not made any real progress in my own participation in a FOSS project.

However, that’s all going to change. Stay tuned for My Year of Open Source v2…

 

Oct 30, 2013
 

Our CS 401 Software Development class was canceled on Monday, 11 February 2013 due to ongoing snow removal and cleanup on campus from the Nemo blizzard. (Worcester received 28.5 inches of snow in just about 24 hours.) This is a problem for a class that meets only on Mondays, especially with the next Monday being a holiday.

As soon as the campus closing was announced on Sunday afternoon, I emailed the students to announce that we would replace class the next day with an IRC (Internet Relay Chat) meeting. (Actually, that’s a lie. The first thing I did was panic, then I screamed, then I ranted to my family about the injustice of cancelling my Monday-only class. Then I thought about holding class on IRC…) Here is the message I sent the students on our class listserv:

Campus is closed tomorrow, so we will not have class. We will not have class next week either due to the President’s Day holiday.

This is going to seriously mess up our schedule. I’ll think about how we can carry on in the two weeks.

Let’s try to hold an IRC chat tomorrow during class time (2:00pm-4:30pm). I’ll send out instructions on installing (optional) and using an IRC client later tonight. I have instructions already written up, I just have to find them, possibly update them, and send them out.

Holding class on IRC would be a little bit of a challenge since the students had not used IRC yet, so this would have to serve as both an IRC familiarization exercise and a useful meeting. I sent them the following message to prepare them:

We are going to meet today on IRC (Internet Relay Chat) at 2:00pm.

You should read through this in advance so that you are prepared. Especially if you are going to install an IRC client – you will need time to set it up. I suggest trying this out at least 1/2 hour in advance to make sure you get the connection working. I’ll stay on IRC all day so you can try out chatting.

You have two choices for connecting to the IRC server:

  1. Install an IRC client. There are many available, you may want to try a few to see which you like the best. Some are standalone applications, and some are browser plugins (like Chatzilla for Firefox.) I’ve heard that mIRC is the most popular for Windows, I use Colloquy on the Mac.
    Here are some of the most important settings you will need. How you set these will depend on your client. You will want to install your client and do the setup in advance of our meeting, so you aren’t late.

    1. Server: irc.freenode.net
    2. If you can set a port, you may want to use 7000 since it can be used for an SSL connection.
    3. Nickname: Choose your own*
    4. Channel: ##WSU-CS401
  2. Use the webchat page on freenode: https://webchat.freenode.net
    1. Nickname: Choose your own*
    2. Channels: ##WSU-CS401
    3. Complete the reCAPTCHA
    4. Connect

* You may want to register your nickname, so that no one else can use it. That way we can all get used to looking for a specific nickname for you. See the instructions: http://freenode.net/faq.shtml#registering

IRC Resources

The most important commands while chatting:

  • /SERVER new-server-hostname
  • /NICK new-nickname
  • /QUIT
  • /JOIN #channelname
  • /ME does something
    This command is used for saying that you are doing something like:
    /ME is looking for that information in my email
  • /LEAVE

Chatting:

  • If you want to address your comments to everybody, just type your comment and hit return.
  • If you want to address your comments to a specific person, type their nickname followed by a colon, then your message. E.g.

         kwurst: I have the answer to your question

I had created a course-specific channel on freenode last spring, so we could use that channel, but to hold a useful meeting, I felt that it would be vital to have a MeetBot running to take minutes. I could have used the #teachingopensource channel, which has zodbot installed, but then the minutes would be saved on Fedora’s website, rather than ours. So I decided to install Supybot with the MeetBot plugin on our own server here.

I managed to get MeetBot installed (mostly – gives me an error message for every meeting command I give, but then does it anyway) and we had a very successful meeting for a class of IRC newbies: http://cs.worcester.edu/kwurst/wsu-cs401/2013/wsu-cs401.2013-02-11-21.13.html

Feb 08, 2013
 

My CS 401 Software Development class for Spring 2013 at Worcester State University is developing an iPad app for the Worcester Art Museum (more on that in another post.)

Because few of the students have Macs, the development environment was going to be a problem. There was the option of using either WSU’s or WAM’s Mac labs, but I figured that the students would want to work outside of the normal hours of the labs. Fortunately, Stoney Jackson at Western New England University suggested that I look into PhoneGap, a free and open source framework for developing cross-platform apps using the web technologies HTML, CSS, and JavaScript. PhoneGap will take a site developed using those technologies and compile it into a native app on a number of platforms, including iOS.

Even better, Adobe, which now owns PhoneGap, has set up a build server. That means that we can just have their site do the build, rather than having to rely on the few students in the class who do have Macs to do the building. To use it for free, your code does have to be in a public GitHub repo, but we were going that route anyway.

Last night I decided to do some more reading on PhoneGap, and discovered that it’s really simple to build a working Hello World app using their Getting Started documentation and GitHub repository of code. I forked the code, and submitted it to the build server, then downloaded the working app to an Android tablet. I wanted to download it onto my iPad as well, but it seems that I’ll have to go through the Apple developer provisioning setup to get a key. I’ve done that before to work on a native iOS app, but I’ve got to go dig through all my notes to get back up to speed on that process.

I decided to write up this post so that my students can see the steps I took, and get the example working on their own systems. This is pretty much just what is posted in the Getting Started page on the PhoneGap Build site.

  1. Fork the https://github.com/phonegap/phonegap-start repo. The fork button is in the upper-right hand corner of the page (https://github.com/phonegap/phonegap-start/fork_select).
  2. Go to your own GitHub page to see the repository you just created.
  3. You can clone that repo to your local machine if you want, but that is not necessary at this point, unless you want to make changes. I decided to leave making changes until later.
  4. You will need the http URI for the repository, so either copy it or leave the page/tab open.
  5. Go to the https://build.phonegap.com/ build site, and choose the Completely Free plan.
  6. Sign in through your GitHub account.
  7. Click the +new app button.
  8. Make sure you’ve clicked the open-source tab.
  9. Paste the URI from your fork of the GitHub repo.
  10. Click Ready to Build.
  11. When it’s done, click the appropriate device icon to download it to your device.

Next steps for me:

  1. Make some changes to my fork, build, and download again.
  2. Figure out the iOS provisioning so I can build and download to my iPad.

We’ll have to figure out how to set up the provisioning for the class after we determine which iPads we have available (student- and/or museum-owned) for testing.

Jun 29, 2012
 

I purchased the book Seven Languages in Seven Weeks from The Pragmatic Bookshelf earlier this week. I’d heard about this book in multiple blogs, and the languages it covers:

  • Ruby
  • Io
  • Prolog
  • Scala
  • Erlang
  • Clojure
  • Haskell

are all “hot” languages that I thought it would be good for me to have some familiarity with. I’ve got about seven weeks left before classes begin again in September, so this seemed like the perfect time to try this.

Today’s task was to install all seven languages. I’m going to be away from the Internet at times, so I figured I had better download all the language environments and make sure they are working, then I can work on the exercises whether I have network access or not.

I’ll try to write more as I work with each language.

Apr 12, 2012
 

Following on Trevor Hodde’s post First Commit!, it’s time to mention our First Merge!

First Nate Doe, and then Trevor, made commits to our class’ fork of the Eutester repository on GitHub. I submitted a pull request, which has now been merged into the master branch of the repository.

We’re looking forward to more commits, and having them merged into the project. I know that Nate and Trevor have issues assigned to them, and I think that Matt Morrisey will soon.

Feb 01, 2012
 

My Spring 2012 course is well underway (into the second week), and going well. But, as I was putting up the latest assignments and resources for the class, it struck me: I’m defaulting to CLOSED!

Many of the materials that I’m using are coming from Heidi Ellis’ course at Western New England University, and from other open, online sources. Yet, here I am, posting them in our Blackboard CMS, where only my students have access to them! And this really is a default action: it’s just what I’m used to doing, so I’ve done it without thinking.

This is somewhat minor at this point, since Heidi has already made these materials available. But, I’m starting to develop new exercises and assignments that others may want to use. And, just as I borrowed some of the course organization from Heidi’s course, someone else might find my “remix” of her organization useful or inspiring.

So, I’m starting to think about where I want to post my course materials to make them open. (Licensing is not the issue, as I’ve been CC-licensing my course materials for years.) Unfortunately, our school is not good about giving faculty web space that they can easily edit for themselves (our default is Microsoft SharePoint). But, we have our own departmental server, where we are hosting our departmental blog (acting as a planet), our Git and Subversion servers, and our Wiki. I have a vestigial web site there (that just redirects to my SharePoint page), so I can probably press that into service as the home for the course.

It may take a bit more work, because it won’t be the default. I’ll have to move all the materials I’ve already posted and remember to post the new ones I’m writing to this more open location.

And, I’ll have to work on making open my new default.