Ruby.Concurrency.Monoids.

Julian Doherty

Senior Developer at Envato

(Yes, we're hiring)

Anecdote 1:Ptolemy Project

• Concurrency research project
• Ran fine for 4 years...
• ...Till it deadlocked. No one knew why
• Experts can't get this right

Anecdote 2:Gig I was on a few years ago

• Heavily concurrent Java system
• 2x single core CPUs
• Ran flawlessly for 3 years
• Then upgraded to 2x multi core CPU...
• And it DIED HORRIBLY every few hours
• 2 weeks of debugging, concurrency bug found in "threadsafe" code

we can do better

Constraints give you flexibility If you add the right constraints

https://secure.flickr.com/photos/josephb/5345589767/

Monoids

Just a type of constraint

In abstract algebra, a branch of mathematics, a monoid is an algebraic structure with a single associative binary operation and an identity element. Monoids are studied in semigroup theory as they are semigroups with identity

https://en.wikipedia.org/wiki/Monoid

Suppose that S is a set and • is some binary operation, S × S → S, then S with • is a monoid if it satisfies the following two axioms:

Associativity

For all a, b and c in S, the equation (a • b) • c = a • (b • c) holds.

Identity element

There exists an element e in S such that for every element a in S, the equations e • a = a • e = a hold.

https://en.wikipedia.org/wiki/Monoid You already know monoids

(assuming you went to primary school)

1 + 1 # 2
[:a :b] + [:c] # [:a, :b, :c]
{a: 1, b: 2}.merge({c: 3, a: 4}) # {a: 4, b: 2, c: 3}

Monoids are just things you already know... with some constraints

• immutable values that are members of some collection
• associative binary operator that "appends" values
• an identity value that doesn't change another value
• (may or may not be commutative)

Addition of natural numbers

• Operator is +
• Identity is 0

0 + 1       == 1
1 + 0       == 1

(1 + 2) + 3 == 6

1 + (2 + 3) == 6

Multiplication of positive numbers

• Operator is *
• Identity is 1

1 * 2       == 2
2 * 1       == 2

(2 * 3) * 4 == 24

2 * (3 * 4) == 24

Concatenating lists

• Operator is +
• Identity is empty list []

[] + [:a, :b]            == [:a, :b]
[:a, :b] + []            == [:a, :b]

([:a, :b] + [:c]) + [:d] == [:a, :b, :c, :d]

[:a, :b] + ([:c] + [:d]) == [:a, :b, :c, :d]

Merging hashes

• Operator is #merge()
• Identity is empty hash {}

{}.merge({a: 1})                               == {a: 1}
{a: 1}.merge({})                               == {a: 1}

{a: 1}.merge({b: 2, c: 3}).merge({d: 4, a: 5}) == {a: 5, b: 2, c: 3, d: 4}

{a: 1}.merge({b: 2, c: 3}.merge({d: 4, a: 5})) == {a: 5, b: 2, c: 3, d: 4}

Even works for objects with composition

• Operator is #mappend()
• Identity is .mempty()

(names stolen from Haskell)

class MyFoo < Struct.new(:list, :number)
def self.mempty
new([], 0)
end

def mappend(other)
MyFoo.new(self.list + other.list, self.number + other.number)
end

# NOTE object is immutable!
private :list=, :number=
end

f1 = MyFoo.new([:a, :b], 1)
f2 = MyFoo.new([:c], 2)
f3 = MyFoo.new([:d], 3)

f1.mappend(MyFoo.mempty) == f1
MyFoo.mempty.mappend(f1) == f1

f1.mappend(f2).mappend(f3) == f1.mappend(f2.mappend(f3))

BUT

No reason transformations have to be done sequentially in single process/thread/fibre...

def scary
value = []

a = Thread.new do
sleep rand
value << :a
end

b = Thread.new do
sleep rand
value << :b
end

a.join
b.join

value
end

Lets rewrite it to be monoidic

def not_scary
a = Thread.new do
sleep rand
[:a]
end

b = Thread.new do
sleep rand
[:b]
end

a.value + b.value
end

Building a complex search engine query

build a basic query object to represent user input

# build the query bit by bit
# in practice these get really complicated, and are each their own class
basic_query = Query.mempty() #empty query as starting point
basic_query = basic_query.mappend(Query.new(term: term))
# ...

run multiple subqueries on search engine to get extra data using basic query object combined with extra options

# AHA! We can get a transformed copy of the query to do other subqueries with it,
# all while leaving the original alone for later use. And we can safely do it async!
basic_query.mappend( Query.new(aggregate: :max_sales)).execute!
end

# and again!
basic_query.mappend( Query.new(aggregate: avg_price)).execute!
}

# combine the results with monoid operator
basic_query = basic_query.mappend(max_sales_count: max_sales_thread.value)
basic_query = basic_query.mappend(avg_price: avg_price_thread.value)

execute main query

results = basic_query.execute!

run multiple extra queries on results to cleanse results using original basic query combined with result data

related_search_threads = results.related_searches.map do|search_term|
Thread.new do # start a thread for each separate term in parallel
# remember basic_query? Still it's same immutable self :)
related_query = basic_query.mappend(Query.new(term: related_search_term)
related_query = related_query.mappend(Query.new(aggregate: :exists)
related_query.execute!
end
end

# remove related ones that don't have any results
related_search_terms = related_search_threads.select{|t| t.value.exists? }

final_result = result.mappend(Results.new(related: related_search_terms)

Summary

• Monoids are fun!
• If you've done primary school maths, you're already using them
• Just the same thing you're already doing - with constraints
• Great way to isolate logic
• Lets you compose transformations
• Immutable
• Immutability, isolation, and composition are awesome used with concurrency