### Android image-labelling app

Need to label a large set of images for training a classifier?  Try this app:

The app takes a URL pointing to an Apache-style index of your image dataset. Class types are user-configured. Results are emailed to the user.

The app is targeted at neural-network/machine learning developers. It’s been tested with .bmp and .jpg images, and should also work with .png/.gif types.

As an alternative to directly serving an image dataset in a public Apache directory, the images can be placed in a public Dropbox folder with an accompanying manually-constructed index.html file (example to come soon).
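Server-side, the app only needs a list of image URLs, and scraping them out of an Apache-style index page is straightforward. A minimal sketch in Python (illustrative only; the app itself is a native Android client, and the example URL and HTML below are invented):

```python
import re

IMAGE_EXTENSIONS = (".bmp", ".jpg", ".jpeg", ".png", ".gif")

def extract_image_urls(index_html, base_url):
    """Pull image links out of an Apache-style directory index page."""
    # Apache autoindex pages list each file as <a href="name.ext">name.ext</a>
    hrefs = re.findall(r'href="([^"]+)"', index_html)
    return [base_url.rstrip("/") + "/" + h
            for h in hrefs
            if h.lower().endswith(IMAGE_EXTENSIONS)]

html = ('<a href="../">Parent</a> '
        '<a href="car1.jpg">car1.jpg</a> '
        '<a href="car2.BMP">car2.BMP</a>')
print(extract_image_urls(html, "http://example.com/dataset/"))
```

The same function works against a hand-written index.html in a Dropbox folder, which is the whole point of the manual-index alternative.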

### Traffic turn count apps

Here’s a free web-based app for traffic turn/movement counts:

http://freetrafficcount.com/

And here’s a native Android app:

### Requirements for a practical image segmentation algorithm

Locating individual objects in an image or video requires segmentation: the task of identifying whether a particular pixel comes from some object of interest or from the background.

Say, for example, we’re looking at an image of traffic and we want to count vehicles.

The ideal results of segmenting individual vehicles in the above image would be something like this:

Dotted lines represent inferred outlines. The important attribute is that overlapping (occluding) vehicles are outlined separately rather than being lumped together.

The problem is that simple background-subtraction blob detection methods result in segmentation more like this:

*LIDAR, stereo-vision and other 3D-sensing technologies might be helpful for segmentation, but I’m focusing on cheap camera hardware in this application.

So, how do we design an algorithm to transform the inferior background-subtraction-based blobs/outlines into the ideal, vehicle-separating outlines?

One requirement is that our algorithm must have prior knowledge of vehicle appearances. It’s probably not possible to perform segmentation on a single image without having been exposed to many vehicles/training examples in the past. Otherwise, it would not be possible to infer the outline of obscured objects.

On the other hand, it’s obviously not possible to store examples of every vehicle variation in existence. Some middle ground must be achieved where a representative basis set is learned from training data, sufficient to represent actual objects fairly accurately but not burdensome on computer resources.

So, assuming we have some basis set with which to represent the objects being segmented, the next question is how we deconstruct the image into elements generated from the basis set.

For now, let’s ignore the case where a single object is segmented into multiple blobs by background subtraction. So, we’re considering the cases where each blob is either a) a single vehicle or b) some small quantity of vehicles.

How do we identify the quantity of vehicles in each blob? One approach is a direct neural-network classifier, trained on hand-classified examples of one, two, three (etc) vehicles. This emulates the cognitive process known as subitizing. The results of subitizing provide an object count for each blob. This can be used to select the number of predicted objects from our object basis set.
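As a toy stand-in for such a subitizing classifier, here is a nearest-centroid rule over simple blob features (area and bounding-box aspect ratio). The features, centroids and normalization constant are invented for illustration; a real implementation would train a neural network on hand-classified blobs:

```python
def subitize(blob_area, aspect_ratio, centroids):
    """Return the vehicle count whose feature centroid is nearest this blob.

    centroids maps a count to an (area, aspect_ratio) centroid, e.g.
    learned from hand-labelled blobs; the values used here are hypothetical.
    """
    def dist(count):
        a, r = centroids[count]
        # divide area by 1000 so the two features are roughly comparable
        return ((blob_area - a) / 1000.0) ** 2 + (aspect_ratio - r) ** 2
    return min(centroids, key=dist)

# Hypothetical centroids for 1-, 2- and 3-vehicle blobs
centroids = {1: (800, 1.6), 2: (1700, 2.9), 3: (2600, 4.1)}
print(subitize(1650, 3.0, centroids))  # → 2
```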

The final task is locally fitting the basis-set elements to each blob. This would depend on the method of representation, but we can make some assumptions. Translational brute-force testing by sliding an image prototype horizontally and vertically across the image to maximize similarity could be done in real-time if it’s over a small enough area. Z-ordering can be tested exhaustively as well. But how do we draw the actual basis-set examples to be adjusted/translated? If we’re using RBMs as the representative set, we could use activation energies to guide selection. If we’re storing raw images for a nearest-neighbor type approach, we could compute the distance to each example image using some sparse feature set.
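The translational brute-force step can be sketched as sum-of-squared-differences template matching, here in pure Python over a toy intensity grid (a real system would slide a basis-set prototype over the blob's neighbourhood rather than a stored patch):

```python
def match_template(image, template):
    """Slide template over image; return the (row, col) minimizing SSD."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best, best_pos = float("inf"), (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            ssd = sum((image[r + i][c + j] - template[i][j]) ** 2
                      for i in range(th) for j in range(tw))
            if ssd < best:
                best, best_pos = ssd, (r, c)
    return best_pos

image = [[0, 0, 0, 0],
         [0, 9, 8, 0],
         [0, 7, 9, 0],
         [0, 0, 0, 0]]
template = [[9, 8],
            [7, 9]]
print(match_template(image, template))  # → (1, 1)
```

Restricting r and c to a small window around the blob keeps this real-time, as noted above.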

Here’s the overall architecture:

1. Background modeling
2. Background subtraction
3. Blob identification
4. Subitizing
5. Local template-matching
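The first three stages can be made concrete; the following Python is a minimal sketch with a running-average background model and flood-fill blob labelling. The subitizing and template-matching stages are omitted since they depend on the learned basis set, and the threshold values are arbitrary:

```python
def update_background(bg, frame, alpha=0.05):
    """Stages 1-2: running-average background model plus subtraction.

    Returns the updated background and a boolean foreground mask.
    """
    new_bg = [[(1 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
              for brow, frow in zip(bg, frame)]
    mask = [[abs(f - b) > 30 for b, f in zip(brow, frow)]
            for brow, frow in zip(bg, frame)]
    return new_bg, mask

def find_blobs(mask):
    """Stage 3: label 4-connected foreground regions via flood fill."""
    h, w = len(mask), len(mask[0])
    seen, blobs = set(), []
    for sr in range(h):
        for sc in range(w):
            if not mask[sr][sc] or (sr, sc) in seen:
                continue
            stack, blob = [(sr, sc)], []
            seen.add((sr, sc))
            while stack:
                r, c = stack.pop()
                blob.append((r, c))
                for nr, nc in ((r+1, c), (r-1, c), (r, c+1), (r, c-1)):
                    if 0 <= nr < h and 0 <= nc < w and mask[nr][nc] \
                            and (nr, nc) not in seen:
                        seen.add((nr, nc))
                        stack.append((nr, nc))
            blobs.append(blob)
    return blobs

mask = [[False, True,  True,  False],
        [False, True,  False, False],
        [False, False, False, True]]
print(len(find_blobs(mask)))  # → 2
```

Each blob returned here would then be handed to the subitizing and template-matching stages.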

### Generating images of traffic with Restricted Boltzmann Machines

As part of a traffic object classification system, Restricted Boltzmann Machines (RBMs) can be used to generate images of traffic. This post isn’t meant to explain why it’s helpful to generate images, just how the generation procedure itself works.

### A generated image

A single RBM takes a binary input vector and produces a single activation probability. It can also be run in reverse: going from an activation state back to an input probability. An extension for continuous input data (known as a cRBM) is necessary for RGB images, as used in this example.

The activation probability p for a single RBM is calculated by multiplying the input vector b by the RBM’s weight vector w and taking the logistic function of the result:
$p = \sigma ( \mathbf{b} \cdot \mathbf{w})$

where σ is the logistic function:
$\sigma(x) = \frac{1}{1 + e^{-x}}$

A single RBM is completely represented by its weight vector w. In practice, an array of RBMs is trained in order to reproduce some training set. Typically, a much smaller quantity of RBMs is trained relative to the number of training images. If the RBMs can learn an efficient representation of the input samples, an image can be reproduced by calculating the response for each RBM in the set and re-sampling the inputs based on RBM activation. The state of RBM activations in response to a single image can be viewed as a compression of the input image. A 30×30 RGB image (a 2700-length vector) is converted to a hidden-layer activation vector of length 50, in the case of a 50-hidden-unit RBM set.
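Numerically, the compression and reconstruction passes are just a weighted sum followed by the logistic function, in each direction. A pure-Python sketch with a toy weight matrix (2 hidden units, 3 visible units; a real set would be e.g. 50 hidden units over a 2700-length visible vector):

```python
import math

def sigma(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def hidden_activations(weights, visible):
    """Forward pass: one activation probability per hidden unit."""
    return [sigma(sum(w * v for w, v in zip(wrow, visible)))
            for wrow in weights]

def reconstruct(weights, hidden):
    """Reverse pass: drive the visible units from hidden activations."""
    n_visible = len(weights[0])
    return [sigma(sum(hidden[j] * weights[j][i] for j in range(len(weights))))
            for i in range(n_visible)]

# Toy weights: 2 hidden units x 3 visible units
weights = [[0.5, -0.2, 0.1],
           [-0.3, 0.4, 0.2]]
visible = [1.0, 0.0, 1.0]
h = hidden_activations(weights, visible)  # length-2 "compression"
v = reconstruct(weights, h)               # length-3 reconstruction
print(len(h), len(v))  # → 2 3
```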

See here for an introduction to RBMs in general:
Introduction to Restricted Boltzmann Machines

And a more detailed guide here:
A Practical Guide to Training Restricted Boltzmann Machines

By setting an RBM’s activation state to ‘on’ and rendering its input probabilities as an image, it’s possible to visualize the RBM’s response.

### 50 RBM Weight Visualizations

A set of 50 hidden-state RBMs was trained on 967 training images for 2 hours on an i5. Here is the full set of weight visualizations after training:

Here is a sample of the training images that were used to calculate the weights above:

And here is a sample of reconstructed input images:

Sometimes it’s helpful to see the RBM’s reconstruction of a single image. Here is a set of inputs with their corresponding reconstructions:

The reconstructions aren’t particularly accurate but they’re good enough for our purposes. Ultimately, the RBM array’s hidden activations will be used as inputs to a neural network for classification. The important thing is that the RBM reconstruction retains enough information to separate cars, background, trucks and so on. Rendering the RBM array weights and reconstructions mostly serves as a sanity check that the RBM’s hidden activations are a good enough compression of the input space to distinguish different object types.

### What causes RBMs to reconstruct the average of the training set?

In testing an implementation of a continuous RBM (Restricted Boltzmann Machine), certain data sets cause the RBM to learn the average of each image rather than isolating any features. In the following cases, the RBM was initialized with 5 hidden units, a learning rate of 0.01 and 5000-10000 training iterations. The results are similar throughout a fairly wide range of parameters (4-10 hidden units, 1000-100,000 iterations, learning rates from 0.1 to 0.001).

In the first example, RBM training has failed and led to identical weights/features. Reconstruction shows that the RBM just reproduces an average of the inputs.

If the training set is altered by adding cases where each feature is present by itself, this issue disappears:

The implementation in C# is below for reference. It’s hard to tell whether this is a bug in my implementation, something related to the way the features appear in the training sets, or just inherent behavior of RBMs.

It appears that other people have encountered this issue as well:
http://stats.stackexchange.com/questions/115946/what-enforces-features-diversity-in-rbm

http://www.quora.com/I-am-training-a-stacked-RBM-on-the-Olivetti-dataset-faces-to-reduce-the-dimensionality-When-I-put-different-faces-through-the-network-I-get-roughly-the-same-result-and-reconstructing-always-results-in-the-average-face-of-the-training-What-am-I-doing-wrong

using System;
using System.Linq;
using System.Threading;

public class RBM
{
    public double[][] Weights;
    public double L;
    public double N;
    public double[][] TrainingSet;
    public int NumberOfCycles;

    [NonSerialized] private Thread _trainingThread = null;
    [NonSerialized] private bool stopTraining = false;
    private Random r;

    /// <summary>
    /// Single-layer Continuous Restricted Boltzmann Machine
    /// </summary>
    /// <param name="dataLength">Input vector size</param>
    /// <param name="numHiddenUnits">Hidden binary units</param>
    /// <param name="learningRate">Weight update multiplier</param>
    /// <param name="noiseMagnitude">Noise magnitude in reconstruction</param>
    public RBM(int dataLength, int numHiddenUnits, double learningRate, double noiseMagnitude)
    {
        r = new Random();
        L = learningRate;
        N = noiseMagnitude;
        InitWeights(dataLength, numHiddenUnits);
    }

    public void Train(double[][] trainingSet, int numberOfCycles)
    {
        for (var i = 0; i < numberOfCycles; i++)
        {
            var index = i % trainingSet.Length;
            var input = trainingSet[index];
            UpdateWeights(input);
        }
    }

    public void TrainWorker()
    {
        var i = 0;
        while (true)
        {
            var index = (i++) % TrainingSet.Length;
            var input = TrainingSet[index];
            UpdateWeights(input);
            if (stopTraining)
                _trainingThread.Abort();
        }
    }

    public void TrainMultithreaded(double[][] trainingSet, int numberOfCycles)
    {
        TrainingSet = trainingSet;
        NumberOfCycles = numberOfCycles;
        stopTraining = false;
        _trainingThread = new Thread(new ThreadStart(TrainWorker));
        _trainingThread.Start();
    }

    public void StopTrainMultithreaded()
    {
        stopTraining = true;
    }

    public void InitWeights(int dataLength, int numHiddenUnits)
    {
        Weights = new double[numHiddenUnits][];
        for (var i = 0; i < numHiddenUnits; i++)
        {
            Weights[i] = new double[dataLength];
            for (var j = 0; j < dataLength; j++)
            {
                Weights[i][j] = r.NextDouble() / 10;
            }
        }
    }

    private double ActivationEnergy(double[] input, int hiddenUnitNumber)
    {
        double sum = 0;
        for (var i = 0; i < Weights[hiddenUnitNumber].Length; i++)
        {
            sum += input[i] * Weights[hiddenUnitNumber][i];
        }
        return sum;
    }

    private double Sigma(double activationEnergy)
    {
        return 1 / (1 + Math.Exp(-activationEnergy));
    }

    public bool[] ComputeActivations(double[] input)
    {
        var activations = new bool[Weights.Length];
        for (var i = 0; i < Weights.Length; i++)
        {
            var p = Sigma(ActivationEnergy(input, i));
            activations[i] = r.NextDouble() < p; // stochastic binary activation
        }
        return activations;
    }

    public double[] ComputeActivationsExact(double[] input)
    {
        var activations = new double[Weights.Length];
        for (var i = 0; i < Weights.Length; i++)
        {
            activations[i] = Sigma(ActivationEnergy(input, i));
        }
        return activations;
    }

    private double[][] ComputeAgreement(double[] input, bool[] activations)
    {
        var agreement = new double[activations.Length][];
        for (var i = 0; i < activations.Length; i++)
        {
            var a = new double[input.Length];
            for (var j = 0; j < input.Length; j++)
            {
                a[j] = activations[i] ? input[j] : 0;
            }
            agreement[i] = a;
        }
        return agreement;
    }

    public double[] ReconstructExact(double[] activations)
    {
        var reconstruction = new double[Weights[0].Length];
        for (var i = 0; i < reconstruction.Length; i++)
        {
            var sum = activations.Select((t, j) => t * Weights[j][i]).Sum();
            reconstruction[i] = Sigma(sum);
        }
        return reconstruction;
    }

    public double[] Reconstruct(bool[] activations)
    {
        var reconstruction = new double[Weights[0].Length];
        for (var i = 0; i < reconstruction.Length; i++)
        {
            var sum = activations.Select((t, j) => t ? Weights[j][i] : 0).Sum();
            reconstruction[i] = Sigma(sum) + N * (r.NextDouble() - 0.5);
        }
        return reconstruction;
    }

    private void UpdateWeights(double[] input)
    {
        var activations = ComputeActivations(input);
        var ePositive = ComputeAgreement(input, activations);
        var reconstruction = Reconstruct(activations);
        var ractivations = ComputeActivations(reconstruction);
        var eNegative = ComputeAgreement(reconstruction, ractivations);
        for (var i = 0; i < Weights.Length; i++)
            for (var j = 0; j < Weights[i].Length; j++)
            {
                Weights[i][j] += L * (ePositive[i][j] - eNegative[i][j]);
            }
    }
}

### Logging network connectivity with a Windows batch file

This is a method of periodically logging network connectivity using a simple batch file. The batch file pings Google and counts the number of lost responses. The output file, connectivity.log, can be analyzed for uptime measurements.

ctest.bat file contents:

ping www.google.com | findstr "Lost" > num_lost_string
FOR /F "tokens=10" %%f IN (num_lost_string) DO  @echo %%f > num_lost
findstr "0" num_lost
IF ERRORLEVEL 1 (
echo|set /p=Down %date% %time%, >> connectivity.log
type num_lost >> connectivity.log
) ELSE (
echo|set /p=Up %date% %time%, >> connectivity.log
type num_lost >> connectivity.log
)
del num_lost_string
del num_lost

Set the batch file to run every few minutes using Windows Task Scheduler UI.
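Once connectivity.log has accumulated some entries, uptime can be computed by tallying the Up/Down lines. A sketch in Python; the sample lines below are invented but follow the format the batch file writes (status, %date% %time%, lost-packet count), which will vary with the system's locale settings:

```python
def uptime_fraction(log_lines):
    """Fraction of logged samples that reported connectivity."""
    up = sum(1 for line in log_lines if line.startswith("Up"))
    total = sum(1 for line in log_lines
                if line.startswith("Up") or line.startswith("Down"))
    return up / total if total else 0.0

# Invented sample entries in the batch file's format
sample = [
    "Up Mon 05/12/2014 10:00:00.00, 0",
    "Up Mon 05/12/2014 10:05:00.00, 0",
    "Down Mon 05/12/2014 10:10:00.00, 4",
    "Up Mon 05/12/2014 10:15:00.00, 0",
]
print(uptime_fraction(sample))  # → 0.75
```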

### Cosets, Abelian groups and Pauli matrices

Cayley graph of the 16-element Pauli group, using three generators. Posted by user Maproom on Wikipedia.

The Group.rb class is updated with functions to calculate subgroup cosets and check commutativity. The attached test set uses Pauli matrices as an example of a non-commutative/non-Abelian group.

Group.rb:

class Group
#Alexander Farley, March 2014
#
#This class can be used to verify and compute basic properties of finite discrete groups.
#Usage:
#  sign = [-1,1]
#  mult = Proc.new{ |a,b| a*b }
#  mult_sign = Group.new(sign, mult)
#  mult_sign.isGroup? => true
#

require 'set'

def initialize(set, block)
@set = set
@block = block
end

def isGroup?
has_identity = self.has_identity?
is_associative = self.is_associative?
is_closed = self.is_closed?
has_inverses = self.has_inverses?
isGroup = has_identity && is_associative && is_closed && has_inverses
if not has_identity
p 'missing identity'
end
if not is_associative
p 'not associative'
end
if not is_closed
p 'not closed'
end
if not has_inverses
p 'missing inverse'
end
return isGroup
end

def is_closed?
is_closed = true
@set.each do |element|
@set.each do |other_element|
next_element = @block.call(element, other_element)
if not @set.include?(next_element)
is_closed = false
end
end
end
return is_closed
end

def has_identity?
@set.each do |element|
candidate_identity = true
@set.each do |other_element|
next_element = @block.call(element, other_element)
if not next_element == other_element
candidate_identity = false
end
end
if candidate_identity == true
return true
end
end
return false
end

def identity
has_identity = false
@set.each do |element|
candidate_identity = true
@set.each do |other_element|
next_element = @block.call(element, other_element)
if not next_element == other_element
candidate_identity = false
end
end
if candidate_identity == true
return element
end
end
puts 'No identity exists'
return nil
end

def is_associative?
is_associative = true
@set.each do |a|
@set.each do |b|
@set.each do |c|
element_ab_c = @block.call(@block.call(a, b), c)
element_a_bc = @block.call(a, @block.call(b,c))
if not element_ab_c == element_a_bc
is_associative = false
end
end
end
end
return is_associative
end

def has_inverses?
has_inverses = true
id = self.identity
@set.each do |element|
found_inverse = false
@set.each do |other_element|
next_element = @block.call(element, other_element)
if next_element == id
found_inverse = true
end
end
if found_inverse == false
has_inverses = false
end
end
return has_inverses
end

def lcoset(subgroup,group_element)
lcoset = Array.new
subgroup.each do |subgroup_element|
result = @block.call group_element, subgroup_element
if not lcoset.include? result
lcoset << result
end
end
return lcoset
end

def rcoset(subgroup,group_element)
rcoset = Array.new
subgroup.each do |subgroup_element|
result = @block.call subgroup_element, group_element
if not rcoset.include? result
rcoset << result
end
end
return rcoset
end

def lcosets(subgroup)
lcosets = []
@set.each do |element|
lcoset = self.lcoset subgroup, element
if not lcosets.include? lcoset.to_set
lcosets << lcoset.to_set
end
end
return lcosets
end

def rcosets(subgroup)
rcosets = []
@set.each do |element|
rcoset = self.rcoset subgroup, element
if not rcosets.include? rcoset.to_set
rcosets << rcoset.to_set
end
end
return rcosets
end

def isAbelian?
isAbelian = true
@set.each do |element_a|
@set.each do |element_b|
result_ab = @block.call element_a, element_b
result_ba = @block.call element_b, element_a
if not result_ab == result_ba
isAbelian = false
end
end
end
return isAbelian
end

end


test_group.rb:

require './group'
require 'test/unit'

class TestGroup < Test::Unit::TestCase

def test_add_int_mod10_isGroup
int_mod10 = [0,1,2,3,4,5,6,7,8,9]
add_mod10 = Proc.new{ |a,b| (a+b)%10 }
add_int_mod10 = Group.new(int_mod10, add_mod10)
assert_equal true, add_int_mod10.isGroup?
end

def test_mult_int_mod10_isNotGroup
int_mod10 = [0,1,2,3,4,5,6,7,8,9]
mult = Proc.new{ |a,b| a*b }
mult_int_mod10 = Group.new(int_mod10, mult)
assert_equal false, mult_int_mod10.isGroup?
end

def test_add_bool_isNotGroup
bool = [0,1]
add = Proc.new{ |a,b| a+b }
add_bool = Group.new(bool, add)
assert_equal false, add_bool.isGroup? #Not closed: 1+1 = 2
end

def test_add_mod2_bool_isGroup
bool = [0,1]
add_mod2 = Proc.new{ |a,b| (a+b)%2 }
add_mod2_bool = Group.new(bool, add_mod2)
assert_equal true, add_mod2_bool.isGroup?
end

def test_mult_sign_isGroup
sign = [-1,1]
mult = Proc.new{ |a,b| a*b }
mult_sign = Group.new(sign, mult)
assert_equal true, mult_sign.isGroup?
end

def test_lcoset_of_sign_one_is_one #From Wikipedia cosets example, http://en.wikipedia.org/wiki/Coset
sign = [-1,1]
one = [1]
mult = Proc.new{ |a,b| a*b }
mult_sign = Group.new(sign, mult)
assert_equal one, mult_sign.lcoset(one,1) #Note that the single group element (g in gH) should not be an array
end

def test_lcosets_of_add_mod8 #From Wikipedia cosets example, http://en.wikipedia.org/wiki/Coset
int_mod8 = [0, 1, 2, 3, 4, 5, 6, 7]
sg = [0,4]
add_mod8 = Proc.new{ |a,b| (a+b)%8 }
add_int_mod8 = Group.new(int_mod8, add_mod8)
assert_equal [[0,4].to_set, [1,5].to_set, [2,6].to_set, [3,7].to_set], add_int_mod8.lcosets(sg)
end

def test_rcosets_of_add_mod8 #From Wikipedia cosets example, http://en.wikipedia.org/wiki/Coset
int_mod8 = [0, 1, 2, 3, 4, 5, 6, 7]
sg = [0,4]
add_mod8 = Proc.new{ |a,b| (a+b)%8 }
add_int_mod8 = Group.new(int_mod8, add_mod8)
assert_equal [[0,4].to_set, [1,5].to_set, [2,6].to_set, [3,7].to_set], add_int_mod8.rcosets(sg)
end

def test_add_int_mod8_isGroup
int_mod8 = [0, 1, 2, 3, 4, 5, 6, 7]
add_mod8 = Proc.new{ |a,b| (a+b)%8 }
add_int_mod8 = Group.new(int_mod8, add_mod8)
assert_equal true, add_int_mod8.isGroup?
end

def test_mult_pauli_isNotAbelian #http://en.wikipedia.org/wiki/Pauli_group
require 'matrix'
id = Matrix[ [Complex(1), Complex(0)], [Complex(0), Complex(1)]]; id_neg = -1*id; id_i = Complex(0,1)*id; id_negi = Complex(0,-1)*id
p1 = Matrix[ [Complex(0), Complex(1)], [Complex(1), Complex(0)]]; p1_neg = -1*p1; p1_i = Complex(0,1)*p1; p1_negi = Complex(0,-1)*p1
p2 = Matrix[ [Complex(0), Complex(0,-1)], [Complex(0,1), Complex(0)]]; p2_neg = -1*p2; p2_i = Complex(0,1)*p2; p2_negi = Complex(0,-1)*p2
p3 = Matrix[ [Complex(1), Complex(0)], [Complex(0), Complex(-1)]]; p3_neg = -1*p3; p3_i = Complex(0,1)*p3; p3_negi = Complex(0,-1)*p3
pauli_matrices = [id, id_neg, id_i, id_negi, p1, p1_neg, p1_i, p1_negi, p2, p2_neg, p2_i, p2_negi, p3, p3_neg, p3_i, p3_negi]

mult = Proc.new{ |a,b| a*b }
mult_pauli_matrices = Group.new(pauli_matrices, mult)
assert_equal false, mult_pauli_matrices.isAbelian?
end

def test_mult_pauli_isGroup #http://en.wikipedia.org/wiki/Pauli_group
require 'matrix'
id = Matrix[ [Complex(1), Complex(0)], [Complex(0), Complex(1)]]; id_neg = -1*id; id_i = Complex(0,1)*id; id_negi = Complex(0,-1)*id
p1 = Matrix[ [Complex(0), Complex(1)], [Complex(1), Complex(0)]]; p1_neg = -1*p1; p1_i = Complex(0,1)*p1; p1_negi = Complex(0,-1)*p1
p2 = Matrix[ [Complex(0), Complex(0,-1)], [Complex(0,1), Complex(0)]]; p2_neg = -1*p2; p2_i = Complex(0,1)*p2; p2_negi = Complex(0,-1)*p2
p3 = Matrix[ [Complex(1), Complex(0)], [Complex(0), Complex(-1)]]; p3_neg = -1*p3; p3_i = Complex(0,1)*p3; p3_negi = Complex(0,-1)*p3
pauli_matrices = [id, id_neg, id_i, id_negi, p1, p1_neg, p1_i, p1_negi, p2, p2_neg, p2_i, p2_negi, p3, p3_neg, p3_i, p3_negi]

mult = Proc.new{ |a,b| a*b }
mult_pauli_matrices = Group.new(pauli_matrices, mult)
assert_equal true, mult_pauli_matrices.isGroup?
end

end


### Groups with Ruby

This image was generated by Vladimir Bulatov’s Polyhedra Stellations Applet: http://bulatov.org/polyhedra/stellation_applet

I’ve started writing some functions for verifying introductory group theory results in order to gain a better understanding of group theory. So far, my Group class is able to determine whether a particular discrete set and operator combination is a group.

Corrections or suggestions for more functionality would be appreciated. I plan on adding functions to calculate cosets and verify Lagrange’s theorem.

class Group

def initialize(set, block)
@set = set
@block = block
end

def isGroup?
has_identity = self.has_identity?
is_associative = self.is_associative?
is_closed = self.is_closed?
has_inverses = self.has_inverses?
isGroup = has_identity && is_associative && is_closed && has_inverses
if not has_identity
p 'No identity'
end
if not is_associative
p 'Not associative'
end
if not is_closed
p 'Not closed'
end
if not has_inverses
p 'Missing inverses'
end
return isGroup
end

def is_closed?
is_closed = true
@set.each do |element|
@set.each do |other_element|
next_element = @block.call(element, other_element)
if not @set.include?(next_element)
is_closed = false
end
end
end
return is_closed
end

def has_identity?
@set.each do |element|
candidate_identity = true
@set.each do |other_element|
next_element = @block.call(element, other_element)
if not next_element == other_element
candidate_identity = false
end
end
if candidate_identity == true
return true
end
end
return false
end

def identity
has_identity = false
@set.each do |element|
candidate_identity = true
@set.each do |other_element|
next_element = @block.call(element, other_element)
if not next_element == other_element
candidate_identity = false
end
end
if candidate_identity == true
return element
end
end
puts 'No identity exists'
return nil
end

def is_associative?
is_associative = true
@set.each do |a|
@set.each do |b|
@set.each do |c|
element_ab_c = @block.call(@block.call(a, b), c)
element_a_bc = @block.call(a, @block.call(b,c))
if not element_ab_c == element_a_bc
is_associative = false
end
end
end
end
return is_associative
end

def has_inverses?
has_inverses = true
id = self.identity
@set.each do |element|
found_inverse = false
@set.each do |other_element|
next_element = @block.call(element, other_element)
if next_element == id
found_inverse = true
end
end
if found_inverse == false
has_inverses = false
end
end
return has_inverses
end

end


Unit tests:

require './group'
require 'test/unit'

class TestGroup < Test::Unit::TestCase

def test_add_int_mod10_isGroup
int_mod10 = [0,1,2,3,4,5,6,7,8,9]
add_mod10 = Proc.new{ |a,b| (a+b)%10 }
add_int_mod10 = Group.new(int_mod10, add_mod10)
assert_equal true, add_int_mod10.isGroup?
end

def test_mult_int_mod10_isNotGroup
int_mod10 = [0,1,2,3,4,5,6,7,8,9]
mult = Proc.new{ |a,b| a*b }
mult_int_mod10 = Group.new(int_mod10, mult)
assert_equal false, mult_int_mod10.isGroup?
end

def test_add_bool_isNotGroup
bool = [0,1]
add = Proc.new{ |a,b| a+b }
add_bool = Group.new(bool, add)
assert_equal false, add_bool.isGroup? #Not closed: 1+1 = 2
end

def test_add_mod2_bool_isGroup
bool = [0,1]
add_mod2 = Proc.new{ |a,b| (a+b)%2 }
add_mod2_bool = Group.new(bool, add_mod2)
assert_equal true, add_mod2_bool.isGroup?
end

end


### Fast Mandelbrot set rendering

This demo is written for the Gameboy Advance. The GBA uses an ARM7 processor; the devkitpro toolchain can be used to compile Gameboy ROMs as described in the Tonc tutorials.

Plotting the Mandelbrot set isn’t particularly difficult. It does, however, plot very slowly (1 or 2 minutes for full resolution: 240×160 at 32 iterations per pixel). It’s nicer if there’s something to look at while it’s rendering. In this example, the set is plotted at progressively higher resolution. Initially it’s plotted in 15×15-pixel blocks, sampling the image only once per block. The code avoids re-calculating pixels that have already been calculated by maintaining a list of the resolutions plotted so far (implicitly tracking which pixels have been sampled). This reduces the computational requirements of this early/cheap plotting routine. The only real overhead is the extra time required to plot blocks instead of pixels at the lower resolutions.
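The progressive scheme can be sketched as follows, in Python rather than GBA C for brevity. Each pass visits only the grid positions that no coarser pass has already sampled:

```python
def mandelbrot_iterations(cx, cy, max_iter=32):
    """Escape-time count for one sample point."""
    x = y = 0.0
    for i in range(max_iter):
        if x * x + y * y > 4.0:
            return i
        x, y = x * x - y * y + cx, 2 * x * y + cy
    return max_iter

def progressive_samples(width, height, block_sizes=(15, 5, 1)):
    """Yield (x, y, block) once per new sample, coarse blocks first.

    A position is skipped at a fine pass if a coarser pass already
    sampled it, mirroring the demo's avoid-recalculation trick.
    """
    done = set()
    for block in block_sizes:
        for y in range(0, height, block):
            for x in range(0, width, block):
                if (x, y) not in done:
                    done.add((x, y))
                    yield x, y, block

# 30x30 image, two passes: 4 coarse 15x15 blocks, then the 5x5 grid
samples = list(progressive_samples(30, 30, block_sizes=(15, 5)))
print(len(samples))  # → 36
```

Each yielded sample would be evaluated with mandelbrot_iterations and drawn as a block×block rectangle.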

Code based on Tonc’s m3_demo: http://asfarley.com/mandelbrot_demo.zip

### Recursive search

Recursive search tree: topic ‘ruby’

Goal: construct a tree populated with topics related to a keyword of interest. Algorithm:

1. Every day:
2. Search Google for the keyword of interest and collect the top 10 result links.
3. Follow the top 10 links. Get all HTML. Drop everything but text (javascript, formatting, other tags).
4. Collate text documents & split into words.
5. Remove “noise” words (discussed later) and occurrences of the keyword.
6. Sort words by frequency of occurrence.
7. Select the top n most frequently co-occurring terms and repeat.
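Steps 4-6 boil down to tokenizing, filtering and counting. A condensed sketch in Python; the text and noise list below are invented, standing in for the collated search results and the learned background-noise model described at the end of the post:

```python
import re
from collections import Counter

def top_cooccurring(text, keyword, noise, n=3):
    """Split text into words, drop noise words and the keyword itself,
    and return the n most frequent remaining terms."""
    words = re.findall(r"[a-z]+", text.lower())
    filtered = [w for w in words if w != keyword.lower() and w not in noise]
    return [w for w, _ in Counter(filtered).most_common(n)]

text = ("Ruby is a dynamic language. Ruby gems extend the language. "
        "Rails is a Ruby framework, and Rails uses gems.")
noise = {"is", "a", "the", "and", "uses", "extend", "dynamic"}
print(top_cooccurring(text, "ruby", noise))
```

Each returned term would become a child node in the tree and the seed for the next recursive search.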

Try it out here.

Undesired search results can be blocked in subsequent searches by clicking the red ‘x’ in each node. A JQuery event updates the noise list on the server end.

This was accomplished with Rails, the gems Nokogiri, whenever, sanitize, elif, open-uri and net/http, the Google Search API, and jQuery.

The project’s code (also my homepage) is here on Github.

Noise is modelled by searching for 10 preselected, unrelated topics on Google. The algorithm described above is applied to the collated results (without the noise removal step) and the result is considered to be a general model of internet text background noise. The top most frequently occurring words are stored and blocked from appearing in recursive topic search trees.