brock.hargreaves

Dec 23, 2015
 

When I started piano lessons about 7 months ago, I began to find interesting parallels between writing code and playing piano. It shouldn’t be hard to imagine that one could learn a programming language the way a musician learns an instrument. There is a set of basic tools and principles. In piano, there is the basic syntax of reading music, tools such as arpeggios and scales, and principles like the circle of fifths.

The first thing I did when I wanted to move into C++ development was write an open source library in C++ to better grasp the tools and principles of the language. By the time I was finished, I had exposure to a lot of the basic tools from the standard library, basic principles like type deduction, polymorphism, and templates, and more “language generic” topics such as design patterns, separation of concerns, clean coding, and unit testing.

As part of my journey to expose myself to as much piano as I could, I started watching a lot of YouTube videos of pianists. Even if you don’t play piano or any other instrument, I would suggest listening to this video, at least from 2:15 onwards.

 

There are an incredible number of parallels to learning a programming language, but the following stuck with me:

Incorporating scales into your daily practice routine is the most efficient shortcut to technical mastery and brilliance, and it’s an indissoluble part of every great pianist’s journey.

– Ilinca Vartic

Now if Ilinca was a programmer maybe she would have worded it like this:

Incorporating functions from the standard library into your coding projects is the most efficient shortcut to technical mastery and brilliance, and it’s an indissoluble part of every great programmer’s journey.

Not only do you gain fundamental knowledge of the language you’re using, but you don’t have to write unit tests for it! At some point we all realize our wheel probably isn’t as good as the wheel from the standard library. Though there are many facets to becoming an expert in a programming language, I think having intimate knowledge of its standard library is important. It’s becoming more important to me, at least.

Dec 11, 2015
 

A friend of mine who works for a seismic processing startup was chatting with me about a problem he had with sending off commands to various machines that are supposed to process some data. Essentially, he wanted the ability to queue work on those machines. Since the nodes don’t need to communicate any information, it’s not necessary to use anything fancy like MPI, though he didn’t have any built-in mechanism to manage jobs. He was able to accomplish this task with a clever, though crude, combination of bash, cron, and top. Of course curiosity got the best of me: I wondered how this might be implemented, and shortly thereafter a Python cluster manager was born.

Since I’ve been focused on C++ and CUDA lately, I thought it would be a nice refresher to write a small Python package which provides a more robust solution to this scheduling problem. It gave me a chance to review some simple threading and socket communication using the Python standard library and to experiment with a couple of other packages. Moreover, the more I worked on it, the more I wanted to do with it. I only wanted to dedicate one weekend of casual coding, but this took about 3 such weekends to put together. I give you, clustermuster:

https://github.com/bee-rock/clustermuster
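To give a flavour of the scheduling problem, here is a minimal sketch of queueing commands across a fixed set of nodes using only `queue` and `threading` from the standard library. It is not how clustermuster is implemented: the node names, the `run_on_node` callable, and the `dispatch_jobs` helper are all assumptions made for illustration, and a real version would run each command over ssh rather than just reporting it.

```python
import queue
import threading


def dispatch_jobs(commands, nodes, run_on_node):
    """Run every command exactly once, one job at a time per node.

    `run_on_node(node, command)` is a hypothetical callable supplied by the
    caller (for example a wrapper around ssh); it is not part of any library.
    Returns a list of (node, command, result) tuples in completion order.
    """
    pending = queue.Queue()
    for command in commands:
        pending.put(command)

    finished = []
    finished_lock = threading.Lock()

    def worker(node):
        # Each node pulls the next job as soon as it is free.
        while True:
            try:
                command = pending.get_nowait()
            except queue.Empty:
                return
            result = run_on_node(node, command)
            with finished_lock:
                finished.append((node, command, result))

    threads = [threading.Thread(target=worker, args=(node,)) for node in nodes]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return finished


if __name__ == "__main__":
    def fake_run(node, command):
        # Stand-in for a remote call; just report what would have been run.
        return "{} ran {!r}".format(node, command)

    jobs = ["process shot_{:03d}.segy".format(i) for i in range(6)]
    for node, command, result in dispatch_jobs(jobs, ["node-a", "node-b"], fake_run):
        print(result)
```

Using one worker thread per node keeps the bookkeeping simple: a node is "busy" exactly while its thread is inside `run_on_node`, so no separate load check (the job that top was doing in the bash version) is needed.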

Though the server itself requires ssh authentication with its various nodes, I haven’t included a secure authentication mechanism for sending commands to the manager apart from requiring an appropriate schema. Using it within your own network would be okay, provided you trust everyone on your network. If you plan on doing something like this in production, I’m sure there are libraries out there that would accommodate your needs. To be honest, the socket module that comes with the Python standard library is very easy to use incorrectly. Though it’s a great exercise to figure out how to use it and to play with it, the next time I write an application using sockets in Python, I would certainly consider using the Tornado framework, http://www.tornadoweb.org/en/stable/.
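One example of how easy it is to misuse sockets: `recv` may return fewer bytes than were sent, so a single call is not guaranteed to hold a whole message. Below is a minimal sketch of one common remedy, length-prefixed framing, using only the standard library. The helper names `send_message`, `receive_exactly`, and `receive_message` are hypothetical, and this is not taken from clustermuster.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # 4-byte big-endian message length


def send_message(sock, payload):
    """Send one message as a length header followed by the payload bytes."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def receive_exactly(sock, count):
    """Keep calling recv until exactly `count` bytes have arrived."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def receive_message(sock):
    """Read one length-prefixed message written by send_message."""
    (length,) = HEADER.unpack(receive_exactly(sock, HEADER.size))
    return receive_exactly(sock, length)


if __name__ == "__main__":
    # A connected pair of sockets stands in for the client and the manager.
    left, right = socket.socketpair()
    with left, right:
        send_message(left, b'{"command": "ls"}')
        print(receive_message(right))
```

The same loop-until-complete idea applies to sending, which is why the sketch uses `sendall` rather than `send`.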

I’ll likely do a demo with it in an upcoming post.

Nov 30, 2015
 

While cruising through talks on YouTube I found one titled “I am a legend: Hacking Hearthstone with machine learning”, an interesting application of machine learning in video games:

 

 

Hearthstone is a card game in which each player has a deck of characters, the goal of which is to use those cards to do damage against your opponent. But I digress. His talk had two major themes: finding imbalances in cards and predicting an opponent’s moves. The former involved deriving an overdetermined system and simply applying least squares; the latter involved machine learning. I think a simple explanation of what his algorithm does is that it extracts sequences of cards played from a rich data set:

Experimenting with various dataset sizes, I started to get consistent results when using over 10,000 game replays. Yet using more than 50,000 games does not improve the results, so I settled for a dataset of 50,000 replays that were played between May and July 2014.
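For the first theme, the idea of an overdetermined system can be sketched in a few lines: treat each card as one equation saying that its mana cost is a weighted sum of its attributes, solve for the weights by least squares, and compare each card's actual cost to the predicted one. The cards, attributes, and numbers below are made up for illustration and are not taken from the talk. The solver uses the normal equations, which is fine for a toy problem but not what you would want on real data.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    rows = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n + 1):
                rows[r][c] -= factor * rows[col][c]
    solution = [0.0] * n
    for r in reversed(range(n)):
        known = sum(rows[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (rows[r][n] - known) / rows[r][r]
    return solution


def least_squares(a, b):
    """Minimise ||a x - b||_2 via the normal equations (a^T a) x = a^T b."""
    columns = list(zip(*a))
    ata = [[sum(p * q for p, q in zip(ci, cj)) for cj in columns] for ci in columns]
    atb = [sum(p * q for p, q in zip(ci, b)) for ci in columns]
    return solve_linear_system(ata, atb)


if __name__ == "__main__":
    # Hypothetical cards: (name, attack, health, mana cost).
    cards = [
        ("A", 1, 3, 2), ("B", 3, 1, 2), ("C", 2, 4, 3),
        ("D", 4, 2, 3), ("E", 5, 5, 5), ("F", 6, 6, 4),
    ]
    attributes = [[attack, health] for _, attack, health, _ in cards]
    costs = [cost for _, _, _, cost in cards]
    weights = least_squares(attributes, costs)
    for name, attack, health, cost in cards:
        predicted = weights[0] * attack + weights[1] * health
        print(name, "actual", cost, "predicted", round(predicted, 2))
```

A card whose actual cost sits well below its predicted cost is a candidate for being undercosted, which is the kind of imbalance the talk was looking for.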

Part of writing a good talk is being able to balance technical and non-technical aspects of a subject while maintaining the attention of an audience with varied backgrounds. I think Elie did a decent job of this, though of course I am biased and would have hoped for a bit more depth in the machine learning.

I was a bit upset that the speakers said they were going to release their software and then didn’t, but their reasons, given after they had been in contact with the creators of Hearthstone, were sensible. However, there are parts of their software that I think they should release, such as the front end web application and back end web server; this would allow people to write their own statistical/machine learning plugins. It is the machine learning portion of the software that gives users an unfair advantage, which is what the creators of Hearthstone were concerned about.

Another extremely interesting application of machine learning to video games is MarI/O, a neural network which learns to pass levels in Mario:

 

 

The author of MarI/O, YouTube personality SethBling (I’m not sure what his real name is), used neural networks. Essentially the neural network is a mapping between the blocks on the screen and the buttons you can press to move your character, with a distance travelled associated with each mapping. If Mario makes it further in the level than with a previous mapping, then his fitness goes up and he is evolving. As an aside, neural networks are the same technology that companies like Nvidia use to identify vehicles. Jonathan Cohen, former director of the deep learning department at Nvidia, describes the methodology in a short demo.

These are only some relatively recent talks on machine learning in video games; there are certainly other applications, and more that we’ll see in the future.

Nov 25, 2015
 

If you are passionate about your work, your day to day duties in your job can be incredibly pleasant. I personally love being able to code daily, whether it be writing new features, bug fixing, or refactoring. However, all of those tasks become stressful when under pressure. If you’re not in a management role, then you likely have very little control over how projects are managed and how estimates are given. If your team doesn’t have a consistent metric for measuring the complexity of feature requests or bug fixes, estimates can be hard if not impossible to give. There are only so many factors we can control, but part of being a professional is identifying those factors. Here are a few that have become important to me when coding under pressure.

1. Knowing your limits

It’s easy to make mistakes when sleep deprived and mentally exhausted. Working long hours day after day can ultimately be counterproductive. Make sure you find a balance or you will burn out. I’ve always found it difficult to “turn off” when a problem hasn’t been solved. Yet it’s often when I stop thinking about a problem that it magically becomes clear what needs to be done.

2. Time management

When faced with a list of things that need to be done, proper allocation of your time is important. Take performance optimization, for example. Here I’m speaking of code that sits outside the realm of high performance computing. Would you rather have something that works slowly and gives you the right answer, or works faster but gives you the wrong answer? Don’t get bogged down trying to optimise if you don’t have time. Note that “optimise later” doesn’t mean “write clean code later”; you should always strive to write clean code even if it’s not an efficient implementation.

3. Personal velocity

Part of writing software is estimating how long it will take you. Even if your team doesn’t use velocities, you can still gather data from your tracking system and version control software to calculate your own personal velocity. Brian Tarbox and Heather Mardis wrote a thoughtful post about this, “Retrospective Velocity, When Will This Project Be Done?”. If your lead comes to you with a project and a deadline, you should have a quantitative way of estimating how long it will take you, so that you can tell them whether or not the deadline is achievable and, if it is not, find a compromise.
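As a rough sketch of what a quantitative estimate could look like, the snippet below computes a personal velocity from past work and uses it to project how long a new project might take. The record format (points completed per week) and the function names are assumptions for illustration; whatever your tracking system exports will look different, and this is not the method from the post linked above.

```python
import math


def personal_velocity(points_completed_per_week):
    """Average number of points completed per week over the recorded history."""
    if not points_completed_per_week:
        raise ValueError("need at least one week of history")
    return sum(points_completed_per_week) / len(points_completed_per_week)


def weeks_needed(project_points, velocity):
    """Whole weeks required to finish a project at the given velocity."""
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    return math.ceil(project_points / velocity)


def fits_deadline(project_points, weeks_available, history):
    """True if the project is expected to fit in the time available."""
    return weeks_needed(project_points, personal_velocity(history)) <= weeks_available


if __name__ == "__main__":
    history = [8, 10, 12]  # hypothetical points completed in the last three weeks
    print("velocity:", personal_velocity(history))
    print("weeks for a 45 point project:", weeks_needed(45, personal_velocity(history)))
    print("fits a 4 week deadline:", fits_deadline(45, 4, history))
```

Even a crude average like this turns "I think that's too tight" into a number you can negotiate around.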

Start small: pick something you have control of in your development environment and try to improve it, whether that is finding a better way to juggle work and life or finding a balance between design and implementation.

Nov 18, 2015
 

I’ve always been intrigued by computer security, but I never had a lot of time to integrate the computer science and mathematics into it the way I have with digital signal processing. In particular, I think I’d like to do a bit of research into wireless communication security. There’s no better way to learn than by doing, so I decided to experiment with Kali Linux and build my own penetration testing tablet. I was watching a Defcon video recently and someone mentioned something called a “Pwn Pad”, a penetration testing tablet by Pwnie Express which comes bundled with Kali Linux Nethunter:

The Kali Linux NetHunter project is the first Open Source Android penetration testing platform for Nexus devices, created as a joint effort between the Kali community member “BinkyBear” and Offensive Security. NetHunter supports Wireless 802.11 frame injection, one-click MANA Evil Access Point setups, HID keyboard (Teensy like attacks), as well as BadUSB MITM attacks – and is built upon the sturdy shoulders of the Kali Linux distribution and toolsets.

Why buy one when you can build one yourself? After scouring the internet for instructions (most of which were outdated or just didn’t work), I managed to create a concise set which worked for me. Of course, use the following instructions at your own discretion.

Software:
Team Win Recovery Project 2.8.7.0: twrp-2.8.7.0-manta.img
Kali Linux Nethunter 2.0 for Nexus 10 (Mantaray Kitkat) image http://images.kali.org/kali_linux_nethunter_2.0_mantaray_kitkat.zip

Hardware I use:
TL-WN722N high gain 150 Mbps USB wifi adapter
SENA UD100 Industrial Bluetooth USB Adapter


[Image: usb-devices]

It takes just a few basic steps to root your Google Nexus 10 and flash it with the Kali Linux Nethunter image.

  1. Transfer the Kali Linux image to the tablet
  2. Reboot your Nexus 10 into the bootloader screen by holding down both volume buttons and the power button at the same time (3 buttons) until the boot screen shows up, then plug it into your computer.
  3. From your terminal:
    > sudo apt-get install android-tools-fastboot
    > sudo fastboot oem unlock
  4. Select yes on the device by toggling the volume key and pressing power
    > sudo fastboot flash recovery /path/to/twrp-2.8.7.0-manta.img
  5. Toggle the volume button to Recovery Mode and press power to select it. The TWRP screen will now load.
  6. At this point, I would make a backup through TWRP in case things don’t go as planned.
  7. Do a factory wipe through TWRP. Reboot; it will ask you if you want to install SuperSU. Do so.
  8. Finish installing SU after setting up your tablet by opening the SuperSU application. I used TWRP to download and install the application, not Google Play. Reboot into recovery mode, select install, and select the Kali Linux image.

These instructions are from memory. I messed around and tried a bunch of things from the internet that didn’t work since they were outdated; this should be the least amount of work needed to get it running. Voila, here’s a picture of mine:

 

[Image: finished-product]

Post install list:

  • apt-get update
  • apt-get upgrade
  • apt-get dist-upgrade

It comes with very recent versions of Python and g++, which was nice. Good luck and happy testing!

Jul 12, 2015
 

I’m happy to announce that the first working release of spgl1++ is now available. It is based on the Matlab implementation of Michael Friedlander and Ewout van den Berg and is meant to be a standalone library which can be included in your C++ projects.

It’s still “early” in development; as such, it only solves the regular basis pursuit problem with a real measurement matrix A:

\displaystyle \min_{x \in \mathbb{R}^d} \|x\|_1 \quad \text{subject to} \quad Ax = b

Features I’m currently working on, which shouldn’t be far away, include solving the weighted version with denoising and a complex measurement matrix \Phi:

\displaystyle \min_{x \in \mathbb{R}^d} \|x\|_{1,w} \quad \text{subject to} \quad \|\Phi x - b\|_2 < \epsilon

I’ll be tweaking it and testing various matrix vector libraries. It currently works with Armadillo, a C++ linear algebra library whose syntax (API) is deliberately similar to Matlab.

Since the goal is to accommodate general matrix vector libraries, you will have to provide spgl1++ with certain template specializations, such as how it does matrix vector products. For example, with Armadillo:

I say “should”, since I’ve only really tested it with Armadillo and I’m sure I’ve accidentally added some dependencies in the code and will have to work on making it as general as possible. In the coming days/weeks/months I’ll be doing testing and adding documentation.

Happy sparse reconstruction!

Jul 01, 2015
 

One of my goals when I started my website, just after I finished grad school, was to start sharing little snippets of code I had used or developed over the course of my undergraduate and graduate degrees. One thing I did use, and still use frequently, is the Fourier transform. The Fourier transform is used in countless algorithms in seismic imaging because of its relationship to sampling theory, its ability to identify frequency content in signals, and its use in solving differential equations.

One day I stumbled upon a neat series of lectures by my undergraduate NSERC supervisor, Dr. Michael Lamoureux, for a Seismic Imaging Summer School in 2006. Hidden in a footnote of this set of lecture notes is one of my favorite quotes about a mathematical theorem, namely the divergence theorem:

And, as I like to explain to my kids, the Divergence Theorem simply says you can measure how much methane gas a cow produces by either measuring how much is produced in each cubic centimeter of the cow, or simply by measuring how much leaks out its mouth and rear end, and other places on the surface of the cow. – Michael Lamoureux

However, I digress! In these lecture notes Michael describes a variety of theoretical considerations when investigating partial differential equations, and in particular the famous wave equation with initial conditions:

u_{tt} - c^2 \nabla^2 u = 0
u(x,0) = f(x)
u_{t}(x,0) = g(x)

In the third lecture of these notes, the Fourier transform is used to find a solution of the equation. This is also given as an example in Elias Stein’s famous book on harmonic analysis (though I have it written here as Page 395, I’m not sure which volume. It’s likely volume 1). In this case, the Fourier transform of the solution can be written as the sum of sines and cosines:

\hat{u}(k,t) = \hat{f}(k)\cos(2\pi c|k|t) + \hat{g}(k)\frac{\sin(2\pi c|k|t)}{2\pi c|k|}
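To make the formula concrete, here is a minimal 1D sketch in Python that evaluates it on a periodic grid and applies an inverse transform. This is not the Matlab code described in this post: the function names are hypothetical, a naive discrete Fourier transform from the standard library stands in for a proper FFT, and the k = 0 term uses the limit \sin(2\pi c|k|t)/(2\pi c|k|) \to t.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform (fine for small grids)."""
    n = len(samples)
    return [
        sum(samples[j] * cmath.exp(-2j * math.pi * m * j / n) for j in range(n))
        for m in range(n)
    ]


def inverse_dft(coefficients):
    """Inverse of dft, including the 1/n scaling."""
    n = len(coefficients)
    return [
        sum(coefficients[m] * cmath.exp(2j * math.pi * m * j / n) for m in range(n)) / n
        for j in range(n)
    ]


def frequency_grid(n, length):
    """Frequencies (cycles per unit length) in the same order as the DFT output."""
    half = (n + 1) // 2
    return [(m if m < half else m - n) / length for m in range(n)]


def wave_solution(f, g, c, t, length):
    """Evaluate u(x, t) on the grid from initial displacement f and velocity g."""
    n = len(f)
    u_hat = []
    for k, f_k, g_k in zip(frequency_grid(n, length), dft(f), dft(g)):
        omega = 2.0 * math.pi * c * abs(k)
        # sin(omega t) / omega tends to t as omega goes to zero.
        velocity_term = t if omega == 0 else math.sin(omega * t) / omega
        u_hat.append(f_k * math.cos(omega * t) + g_k * velocity_term)
    # The initial data is real, so the imaginary parts are only rounding error.
    return [value.real for value in inverse_dft(u_hat)]


if __name__ == "__main__":
    n, length, c = 64, 1.0, 1.0
    xs = [length * j / n for j in range(n)]
    bump = [math.exp(-200.0 * (x - 0.5) ** 2) for x in xs]
    at_rest = [0.0] * n
    for t in (0.0, 0.1, 0.2):
        u = wave_solution(bump, at_rest, c, t, length)
        print("t = {:.1f}, max |u| = {:.3f}".format(t, max(abs(v) for v in u)))
```

Because the discrete transform treats the grid as periodic, a wave that leaves one side of this sketch re-enters from the other.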

In the future, I might include the derivation though for now this will be sufficient. We simply apply the inverse Fourier transform to obtain a solution. In 2009 or so, I wrote a little Matlab code to demonstrate this solution. There are some scaling factors to match that of the Fourier transform implementation of Matlab. Luckily I’m a bit of a hoarder when it comes to old school work so here it is:

To be honest, there are a lot of poor programming practices going on here. For example when I wrote this, prior to my introduction into software development, I probably gave myself a nice little pat on the back for putting in those comments. Those who employ clean coding practices see where I’m going with this. If you take a look through the code, every time there is a comment it could be naturally refactored into a function with a descriptive name. Moreover, who am I trying to kid with variable names like “W”! No one would quickly realize that I’m referring to a grid of frequencies. Though it is academic code in nature, it’s still embarrassing!

Moving forward, we can then use this function to plot the solution for a couple different initial conditions:

Here are two sample videos:

Notice that we see some funky stuff happening at the boundaries. The mathematical solution given is valid on the entire plane, so the wave should just disappear off our screen. The fact that there are reflections at the boundary is an artifact of the discrete nature of computational mathematics, and these are typically handled using things like absorbing boundary conditions or perfectly matched layers. The easiest, but most computationally inefficient, way of handling it would be to make an even larger grid and only record for the amount of time it takes for the reflection to reach the area you are recording.

Jun 20, 2015
 

I have been using a library called libunittest for unit testing an open source project. A couple of months ago I submitted a feature request for it, namely an assertion for relative approximation. It’s an incredibly easy library to use and comes with a lot of documentation. Here is one of the examples which shows the “easy” way where you don’t have to register the test class:

After installing, link against it and compile, then execute:

This example and others are covered in the tutorial provided on the website: http://libunittest.sourceforge.net/tutorial.html

I personally like a different style that libunittest supports, which takes a bit more code but which I believe is a nice way of organizing a test suite:

For learning about the other types of assertions you can do, you may want to peruse the testing code:

http://sourceforge.net/p/libunittest/code/ci/master/tree/test/test_assertions.cpp

Happy testing!

May 03, 2015
 

I had a problem this weekend where the following snippet would compile within Eclipse, however the IDE would still complain with things like “Symbol ‘array’ could not be resolved”:

There were lots of people who were able to fix this problem by simply adding the appropriate paths, but since the code compiles in my case this wasn’t the solution. Anyway, some genius on Stack Overflow figured it out: http://stackoverflow.com/questions/17131744/eclipse-cdt-indexer-does-not-know-c11-containers

Add the symbol with name “__cplusplus” (which in this case is apparently an override):

__cplusplus

with the following value:

201103L

Now use “Run C/C++ Code Analysis” and the red underlining from the unresolved imports goes away:

There is still that red on the left hand side, which is just highlighting of the scope, and it bothers me. This can be removed by right clicking on the red portion, going to preferences, text editors, and removing the “Show range indicator.” check mark:

Finally something nice looking:

Note that for this example I still needed to ensure I was using the C++11 Language Standard:

Apr 04, 2015
 

In the summer of 2013 I had a fantastic internship in Houston, Texas with an oil and gas company called Total. During graduate school my research group would have consortium meetings, where all of our sponsors would send representatives to learn about our research and how it could impact the way they do seismic imaging. After I gave a talk at one of these meetings, I was approached by a research geophysicist who would eventually become my mentor for my internship. I immediately brought this to the attention of my supervisor who thought it would be a great experience for me.

When I arrived in Houston I began working on taking a Matlab library and rewriting it in Fortran. It was certainly challenging, but I felt like I was constantly progressing. This was an extremely good feeling, as I would struggle with math problems for weeks on end and get nowhere. Each week I would finish a different unit of my project and at the end I felt quite accomplished. Both my mentor and I were happy with what I had accomplished with my time at Total. I took this renewed energy back to Vancouver and finished my thesis!

However, knowing what I know now, there are plenty of things that I would do differently in terms of the development I did during my internship:

  1. Commit code more often. I remember there was this one particular Friday where I started making changes to some code and saved it locally, but then Eclipse crashed and I couldn’t “undo” my changes back to something that I knew worked. I ended up checking out the last revision because something wouldn’t compile anymore and I couldn’t figure out why.
  2. Write more unit and integration tests! I did do system tests to make sure the output I got from MATLAB was the same that I got in Fortran. I knew very little about testing at that time and I wouldn’t have even known what to look for in terms of a Fortran unit testing framework.
  3. Use better design patterns. I ended up having to duplicate code to make everything work for complex variables. Had I used either better abstraction/design or even templates in C++, this could have been avoided.

I don’t “struggle” with the above 3 anymore, but rather they are something I constantly try to think about while I develop. If I’m on my own branch I even feel comfortable committing code that doesn’t compile or where tests fail. I can always go backwards in history if I need to, but if I lose code then it’s gone forever. What I do struggle with is knowing the appropriate designs and sometimes the appropriate way to mock parts of a system that I can’t control. But that comes with practice 🙂