Sunday, November 23, 2014

Blog Reboot

I've been away for far too long, which isn't to say that nothing has been happening.  In fact, an over-abundance of work is exactly what has kept me away.

I was enrolled in a four-course progression for Android programming on Coursera that I would highly recommend for anyone who has the time.  I had completed the first course, and was starting the second when I had to step away because things at work heated up.  It would be very nice to return to this some day.

I'm currently pursuing some ideas using the IBM Bluemix environment.  Now that the Watson API has been published, and the environment has been opened up to the public, it should be possible to do something interesting.  There is a lot to learn here, but I can do this at my own pace.  I hope I can move fast enough for my work to be interesting to others.

The Raspberry Pis are home from Bard, and aren't doing anything.  I like this environment a lot, and hope that we can figure out something useful to do with them in the future.

Wednesday, November 13, 2013

My Raspberry Pi Bramble Goes to College

I am proud to have a son who is a freshman at Bard College and who is taking Operating Systems (Comp Sci 326).  The professor teaches this class using Raspberry Pis, and it is really cool to see what kinds of projects they are doing.  Everyone has to do a class project, and my son has decided to do a parallel merge sort on the Raspberry Pi bramble, using either a divide-and-conquer or a master-slave architecture.  He hasn't decided yet how to arrange things.

I think it's more fun to watch him consider all of the pros and cons of different designs than to try to sort it out myself.  My original goal for the bramble was to create a parallel environment to perform computations for my neural networks course (Coursera, "Neural Networks for Machine Learning").  However, we did everything in Octave in a single Linux image, and it was really fast.  I couldn't figure out a better way to architect the computational environment.  This was my first experience with a numerical methods package, and I was duly humbled as a newbie (my expertise is in operating systems and middleware).

My next idea was to simply run the World Community Grid on each node to create an array of individual elements that were admittedly slow, but which might perform respectably in aggregate.  Enter my son, and his idea for parallel merge sorting.  I'm going to get the bramble back at the end of the semester, and we'll play with it then to see what is possible from a WCG standpoint.  If I'm lucky, my son will have other ideas that will take us for another adventure in computing.

Friday, November 8, 2013

Virtual Machines and How I Use Them

Virtual Machines are nothing new.  I provisioned my first KVM on a Linux machine probably 5 or 6 years ago, and it  didn't go very well.  You really had to understand how a collection of different parts fit together, and then have some luck to make it all work at once.  My biggest problem was network settings - nothing ever seemed to work right.

Things really got better a couple of years ago.  I started with VMWare, and was impressed with how well it all worked, but was really disappointed to realize that I couldn't provision my VM on one physical device, and then run it on another unless I bought a license.  I settled on VirtualBox, because I need to run the same VM on either a Linux or a Windows host.  I'm very impressed with how well things work - from sharing drives and clipboards to the ease with which networking and printing is handled.  Just make sure you have your VMs set up to build and use the VirtualBox guest additions.
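For anyone setting this up, building the guest additions inside a Linux guest usually comes down to installing a build toolchain and then running the installer from the Guest Additions CD image.  This is a rough sketch for a Debian/Ubuntu guest; the package names and the /mnt mount point are typical defaults, not guaranteed for every release:

```shell
# Inside the Linux guest: install what the additions installer
# needs to build its kernel modules.
sudo apt-get update
sudo apt-get install -y build-essential dkms linux-headers-$(uname -r)

# Insert the Guest Additions CD image from the VirtualBox "Devices"
# menu, then mount it and run the installer (the exact filename can
# vary by VirtualBox version).
sudo mount /dev/cdrom /mnt
sudo sh /mnt/VBoxLinuxAdditions.run
```

After a reboot of the guest, the shared clipboard, shared folders, and display resizing should all come to life.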

I now walk my VM on a passport drive between my home and work machines, and things run flawlessly.  My work requires my hard drive to be encrypted, which is a good thing, because in the event that my drive is lost or stolen, the data on it is safe.  I also get multiple copies of my environment - one on each physical machine, and one on the backup drive.  Having suffered a head crash on my work machine before and dealt with the re-building effort, I sleep well knowing that I have redundant versions of my machine.

In a similar way, I use a VM for all of my MOOCs.  I've built up a set of notes and projects for 3 courses now, and this information is precious to me.  My VM is a container for all of my things, and I don't worry about losing it.

In a few weeks I'm attending a hands-on lab to install and configure an OpenStack environment (http://www.meetup.com/OpenStack-New-York-Meetup/events/144883832/).  This requires two VMs so that we can simulate working in a multi-node environment, all within a single laptop.  So not only are VMs convenient and secure when encrypted, they allow you to do things that just can't practically be done with physical hardware.  VMs have really become an essential part of my computing life, and I can't imagine working without them.


Saturday, October 26, 2013

Hello Again ...

I have to admit that my Raspberry Pi bramble has been gathering some dust lately, primarily because I've been off taking classes again.  I've spent a lot of time on two very good classes that I would recommend to anyone:
  • CS188.1x - Artificial Intelligence through edX.  The class is taught by Dan Klein and Pieter Abbeel from UC Berkeley.  It's an excellent class.  I enjoyed it a great deal, and learned a lot of things about machine learning that I just didn't pick up from the vision-based classes that I've had before.
  • Coding the Matrix: Linear Algebra Through Computer Science Applications.  This class is taught by Philip Klein at Brown University, and it's offered through Coursera.  It is another fine course.  I took it for two reasons: 1) I never really learned linear algebra the way I needed to as an undergrad, and 2) it was all in Python.  This is my third Python-based course, and I can now say that I'm reasonably competent in it as a language.
I've met the minimum requirements for a certificate in all 3 of my classes so far.  When things get busy, though, it's hard not to abandon a class for lack of time.  I've come to accept that I probably won't do much more than meet the minimum requirements in any class I take, even though I would like to spend the time to earn high grades in each one.  That doesn't diminish the value I see in taking each class.  All 3 have been great ways to sharpen my skills for the next steps I want to take in my career.

Tuesday, January 1, 2013

Orchestrating the Nodes of a Bramble

One of the things I've been avoiding is creating a set of tools to manage the nodes of this bramble, but it's really the best thing to do in the long run.  I just didn't want to get sidetracked with a programming task that isn't directly related to getting this bramble to work in a parallel fashion.

Although it's mixing metaphors pretty badly, what we need is a maestro - a conductor who can get all of the nodes to do the same thing at the same time, much like an orchestra.  One has to wonder, though, what an orchestra characterized as a bramble would sound like.

At any rate, I've created a kit of primitive tools that help a lot in getting the final configuration together for each of the nodes in the bramble.  They are basically a set of shell scripts that allow you to distribute work to each node in a SIMD fashion.  This makes doing the same thing in several places at once much easier.
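To give a flavor of the fan-out idea, here's a stripped-down sketch of my own.  The names (NODES, RSH, bramble_run) are illustrative, not necessarily what's in the actual toolkit:

```shell
# The node aliases as they appear in /etc/hosts on each machine.
NODES="RsPi1 RsPi2 RsPi3 RsPi4 RsPi5 RsPi6 RsPi7 RsPi8"

# The remote shell to use; override with RSH=echo to dry-run the
# loop and just print what would be executed on each node.
RSH=${RSH:-ssh}

# Run the same command on every node in the bramble, SIMD-style.
bramble_run() {
    for node in $NODES; do
        $RSH "pi@$node" "$@"
    done
}
```

With this sourced, something like `bramble_run uptime` hits all eight nodes with one command, and `RSH=echo bramble_run uptime` shows the plan without touching the network.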

I'm planning to put this toolkit out for everyone to use, but I have to clear it with my employer, who technically owns everything in my head.  Watch this space.

Saturday, December 15, 2012

Success Through a Failed Attempt to Set up an Xserver Environment

When you rsh into a Raspberry Pi, the splash screen includes the following line:

   Type 'startx' to launch a graphical session

I've logged into headless machines before and set the DISPLAY variable so that graphical programs would direct their output to a machine with a real display device running an X server.  This is pretty vanilla Unix stuff, but it led to some other mildly interesting discoveries about my particular VM-based environment.
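For reference, that recipe is essentially one environment variable on the headless side.  The address below is purely an example of my own, not one from this setup:

```shell
# On the headless machine (a Pi, say), point X programs at the box
# that has a real display attached and an X server running.
# Substitute the real address of your display machine.
export DISPLAY=192.168.1.102:0.0

# Any X program launched now should open its window on that remote
# display (which must allow the connection, e.g. via xhost), for
# example:
# xclock &
```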

When I did an ifconfig, it revealed that my IP address was 192.168.91.143.  I was on a different subnet than the rest of my network.  I could ping my Ubuntu guest at this IP address from my Windows 7 base environment, but the response times were very sluggish - ranging from 500 ms to well over a second.  Nothing actually timed out, but it wasn't optimal.  I could also ping my Pis from Windows 7, and things there seemed pretty normal.

For some reason though, I couldn't reach my Ubuntu guest from the Pis.  This meant that I wouldn't be able to use this guest as the display host and have it receive graphical sessions from the Pis.  Why I could go the other direction from Ubuntu to the Pis is beyond me.  There are network details here that I don't understand, and don't really plan to find out.

It then occurred to me that these behaviors are probably the result of the default network connection that VMware and most other hypervisors give you - Network Address Translation (NAT).  This maps the IP address of the guest system to the underlying IP address of the host machine so that they share an IP address as far as the external network can see.  Presumably this means that somewhere in the stack, the 192.168.91.143 address that Ubuntu sees gets mapped to the 192.168.1.102 address of my Windows 7 environment.  All of this address translation is probably why performance suffers when I ping Ubuntu from Windows.

I tried switching to a bridged configuration to see what would happen.  I've had bad experiences with this in the past under KVM and VirtualBox - both of which not only didn't work properly, but they left my guest system flopping around like a fish out of water.

This time however, everything worked as I had expected.  My IP address changed to 192.168.1.118, so I was on the same subnet as everyone else.  I could reach my guest system from the Pis, and the apparent performance issues I had between Windows and Ubuntu vanished.  VMware was solid here, and life is good - almost.

I exported the DISPLAY variable on one of the Pis, pointing it at my new Ubuntu IP address, and attempted to launch a graphical session via startx.  It still didn't work.  At this point, I've decided to put this effort on the shelf.  It isn't strictly needed for what I want to do right now.  This exercise was a fruitful one though, because I'm now running a bridged connection that I think will give me more of what I want in terms of performance and configuration.

I'll be back to this later though.

Monday, December 10, 2012

Back to Bramble Setup

Ok, class finished about a week ago, and it was a great experience.  I'm looking forward to taking another course in the future, but for now it's time to get back to the Pi bramble.  Once I get the setup completed, there are some interesting things that I'm going to try to drag over from my Neural Networks environment to the bramble.

Just to level-set, I've got the message passing interface (mpich2) installed and built on all of the Pi machines.  I see that there is now an mpich version 3 available as of mid-November, but I'm staying put on this version until we get things up and running.

The next step is to configure the network of machines.  I established machine number 1 as the master, and the rest as slave machines.  I followed the directions to set up the rhosts files for the root and pi users, and /etc/hosts so that all of the machines can see each other by name (see the instructions in the west coast labs blog).  My /etc/hosts on each machine now contains these lines (among others):

192.168.1.151    RsPi1    RsPi_1    R1    r1    Master     master
192.168.1.152    RsPi2    RsPi_2    R2    r2    SlaveR2    slaver2    slave2
192.168.1.153    RsPi3    RsPi_3    R3    r3    SlaveR3    slaver3    slave3
192.168.1.154    RsPi4    RsPi_4    R4    r4    SlaveR4    slaver4    slave4
192.168.1.155    RsPi5    RsPi_5    R5    r5    SlaveR5    slaver5    slave5
192.168.1.156    RsPi6    RsPi_6    R6    r6    SlaveR6    slaver6    slave6
192.168.1.157    RsPi7    RsPi_7    R7    r7    SlaveR7    slaver7    slave7
192.168.1.158    RsPi8    RsPi_8    R8    r8    SlaveR8    slaver8    slave8


This is replicated on all of the Pi machines.  It's as far as I've been able to get with the setup so far.  I'm getting close to running something soon.
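A quick way to sanity-check the replicated hosts file is to ping every node by its short alias from the master.  This loop is my own sketch, using the aliases from the table above:

```shell
# One ping per node; -W 2 caps the wait at two seconds so a dead
# node doesn't stall the whole loop.
NODES="RsPi1 RsPi2 RsPi3 RsPi4 RsPi5 RsPi6 RsPi7 RsPi8"
for node in $NODES; do
    if ping -c 1 -W 2 "$node" > /dev/null 2>&1; then
        echo "$node: reachable"
    else
        echo "$node: NOT reachable"
    fi
done
```

Any "NOT reachable" line points at either a missing /etc/hosts entry or a node that's down.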

Tuesday, November 13, 2012

Been away for a while

My neural networks class has ramped up fully, and it's been soaking up a lot of my time.  It's a great class, taught by a great professor - Geoffrey Hinton from the University of Toronto.  I've finally gotten to a lull in the course work, so it's time to get back to the Pis to complete the bramble.

There is a greater plan, which I have mentioned before - once the neural networks class is over, I'm going to try to map this to the bramble.  This may be a fool's errand, but it just might work too.  Either way, it will be an adventure, and may lead to something useful.

Monday, October 8, 2012

Setting up the Message Passing Interface (MPI)

After configuring SSH with public/private key pairs, and setting up aliases in /etc/hosts for each Pi node, one more useful thing is to update the command prompt to make it easy to know which machine you are working on.  I altered the .bashrc profile on each machine like this:

# ~/.bashrc: executed by bash(1) for non-login shells.
# see /usr/share/doc/bash/examples/startup-files (in the package bash-doc)
# for examples

# JB - Put the machine name in the prompt to make it easier to know which
# machine we are running on.
export PI_NODE=R1


...

if [ "$color_prompt" = yes ]; then
    PS1='${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\u@$PI_NODE\[\033[00m\] \[\033[01;34m\]\w \$\[\033[00m\] '
else
    PS1='${debian_chroot:+($debian_chroot)}\u@$PI_NODE:\w\$ '
fi

...


Not only does this make the node name apparent in the prompt, it gives us a handle on each machine to use if we want to write tools:

pi@R1 ~ $ echo $PI_NODE
R1
pi@R1 ~ $


Now it's time to install the message passing software - MPICH2.  I'm going to build this from source, just to get the latest version.  Thanks to Phil Leonard for the pointers to everything.
  1. sudo apt-get install fort77
  2. wget http://www.mcs.anl.gov/research/projects/mpich2/downloads/tarballs/1.4.1p1/mpich2-1.4.1p1.tar.gz
  3. tar zxfv mpich2-1.4.1p1.tar.gz
  4. cd mpich2-1.4.1p1
  5. sudo ./configure
  6. sudo make
  7. sudo make install  
Step 1 was to install the Fortran compiler (fort77), which is no big deal.  Then the rest of it built and installed without any problem.  As Phil mentions in his blog (install option 3), this takes a while - at least a couple of hours, so you may want to plan accordingly.
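Before replicating the build, it's worth a quick single-machine smoke test.  Something along these lines should work, assuming make install put the binaries on your PATH (typically under /usr/local/bin):

```shell
# Confirm the compiler wrapper and the launcher are installed and
# visible on the PATH.
if command -v mpicc > /dev/null 2>&1 && command -v mpiexec > /dev/null 2>&1; then
    # Run a trivial 4-process job on the local node; every rank
    # should print the same hostname since they're all on one Pi.
    mpiexec -n 4 hostname
else
    echo "mpicc/mpiexec not found - check that make install completed"
fi
```

If four copies of the hostname come back, the local install is good and the same check can be repeated on each node as it comes online.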

I got this to build and install properly on my first Pi node, so now I'm off to replicate this on the other machines.  I can run the installs on all of these machines in parallel, of course, but it will still take some time.

Wednesday, October 3, 2012

My First MOOC Class - Neural Nets

Yesterday I started my first online class - the neural networks course offered by Coursera.  I took a set of machine learning classes when I was working on my master's degree at Rensselaer, and I always found neural networks interesting.  Unfortunately our course material never required us to actually create a network, so neural nets were never more than an academic interest for me.

Fortunately, I recognized the name of the instructor for this course when I read the description.  Geoffrey Hinton is a leading name in the area of neural networks, and he was referenced many times in the readings that I had for my masters classes years ago.  This is my chance to actually lay hands on a neural network, and learn from a leading thinker in the area.

My experience with Massive Open Online Courses (MOOCs) is pretty limited right now, but my first impression is positive.  The Coursera site is well constructed and easy to use.  I would recommend it to anyone thinking about taking a class.

The long term goal here is to really learn how to implement neural networks, and then move them onto the Raspberry Pi bramble I'm in the middle of creating.  These networks are inherently parallel beasts, so it should be an interesting exercise.