Onoffswitch

Posts

Sep 21, 2021
Consistent Hashing
Suppose we want to provide a percentage-based rollout of features to a set of users without storing the full set of features for each user. It may sound like overkill, but if you have millions of users and 50 features, suddenly you are talking about storing upwards of 50M items, and that’s no small amount.
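A minimal sketch of the idea, not the post's actual code: hash the user and feature together into a stable bucket from 0 to 99 and compare it against the rollout percentage, so nothing per-user has to be stored. The names `bucket` and `is_enabled`, the choice of SHA-256, and the `feature:user` key format are all assumptions made for illustration.

```python
import hashlib


def bucket(user_id: str, feature: str) -> int:
    """Map a (user, feature) pair to a stable bucket in [0, 100).

    Hypothetical helper: SHA-256 is used only because it is stable
    across processes, unlike Python's built-in hash().
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(user_id: str, feature: str, rollout_percent: int) -> bool:
    """A user sees the feature when their bucket falls below the percentage."""
    return bucket(user_id, feature) < rollout_percent
```

Because the bucket never changes for a given user and feature, raising the percentage from 10 to 50 only adds users; nobody who already had the feature loses it.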
Aug 13, 2019
JIRA CLI Tooling
Aug 6, 2019
Typed react native router
Jul 30, 2019
Tools Matter
Jul 30, 2019
Thoughts on monorepos
Apr 27, 2019
Infra graphs with neo4j
I spent some time recently mucking around with neo4j attempting to model infrastructure, incidents, teams, users, etc. Basically what does it take to answer questions about organizations.
Apr 2, 2019
8 months of go
For the past 8 months I’ve primarily been writing in go. I was hesitant to take a job that used go as its primary language for a lot of reasons, but I decided to give it a try because a lot of companies these days are using it, and it doesn’t hurt to broaden my skillset. In this post I’ll describe the pros and cons of using go from my own experience.
May 17, 2018
Productionalizing ECS

This post was originally posted on my company’s engineering blog here: http://engineering.curalate.com/2018/05/16/productionalizing-ecs.html
May 11, 2018
Debugging "Maximum String literal length exceeded" with scala
Today I ran into a fascinating bug. We use ficus as a HOCON auto parser for scala. It works great, because parsing configurations into strongly typed case classes is annoying. Ficus works by using a macro to invoke implicitly in scope Reader[T] classes for data types and recursively builds the nested parser.
May 3, 2018
AETR an open source workflow engine
For the past several years I’ve been thinking about the idea of an open source workflow execution engine. Something like AWS workflow but simpler. No need to upload python, or javascript, or whatever. Just call an API with a callback url, and when the API completes its step, callback to the coordinator with a payload. Have the coordinator then send that payload to the next step in the workflow, etc.
Feb 17, 2018
Chaos monkey for docker
I work at a mostly AWS shop, and while we still have services on raw EC2, nearly all of our new development is on Amazon ECS in docker. I like docker because it provides a unified unit of operation (a container) that makes it easy to build shared tooling regardless of language/application. It also lets you reproduce your applications locally in the same environment they run in remotely, as well as starting fast and deploying fast.
Feb 15, 2018
Tracking batch queue fanouts

Edit: This code now exists at https://github.com/paradoxical-io/carlyle
Nov 17, 2017
Sbt sonatypeRelease on Travis
I figured I’d drop a quick note here for anyone else running into an issue. If you are trying to do a sonatypeRelease via sbt 1.0.3 on travis and getting a
Nov 8, 2017
Functors in scala
A coworker of mine and I frequently talk about higher kinded types, category theory, and lament about the lack of unified types in scala: namely functors. A functor is a fancy name for a thing that can be mapped on. Wanting to abstract over something that is mappable comes up more often than you think. I don’t necessarily care that it’s an Option, or a List, or a whatever. I just care that it has a map.
Oct 6, 2017
Tracing High Volume Services

This post was originally posted at engineering.curalate.com
Aug 14, 2017
Design patterns
I was asked by a coworker to help write up some simple examples for junior engineers explaining some of the gang of four design patterns in a simpler more digestible format. I took a stab at this this weekend and figured I’d share it with anyone who stumbles here. It’s hosted on github pages and available via github:
Jul 25, 2017
From Thrift to Finatra

Originally posted on the curalate engineering blog
Jul 25, 2017
The HTTP driver pattern
Yet another SOA blog post, this time about calling services. I’ve seen a lot of posts, articles, even books, on how to write services but not a good way about calling services. It may seem trivial, isn’t calling a service a matter of making a web request to one? Yes, it is, but in a larger organization it’s not always so trivial.
Apr 9, 2017
Bit packing Pacman
Haven’t posted in a while, since I’ve been heads down in building a lot of cool tooling at work (blog posts coming), but had a chance to mess around a bit with something that came up in an interview question this week.
Feb 12, 2017
Strongly typed http headers in finatra
When building service architectures one thing you need to solve is how to pass context between services. This is usually stuff like request id’s and other tracing information (maybe you use zipkin) between service calls. This means that if you set request id FooBar123 on an entrypoint to service A, if service A calls service B it should know that the request id is still FooBar123. The bigger challenge is usually making sure that all thread locals keep this around (and across futures/execution contexts), but before you attempt that you need to get it into the system in the first place.
Nov 26, 2016
Dont be afraid of dependency updates
Lots of places I’ve worked at have had an irrational fear of upgrading their dependencies. I understand why, when you have something that works you don’t want to rock the boat. You want to focus on building your product, not dealing with potential runtime errors. Your ops team is happy, things are stable. Life is great.
Nov 8, 2016
Deployment the paradoxical way
First and foremost, this is all Jake Swenson’s brainchild. But it’s just too cool to not share and write about. Thanks Jake for doing all the hard work :)
Oct 16, 2016
Coproducts and polymorphic functions for safety
I was recently exploring shapeless and a coworker turned me onto the interesting features of coproducts and how they can be used with polymorphic functions.
Oct 16, 2016
CassieQ @ Cassandra Summit
I had the great chance to talk at Cassandra summit 2016 this year about cassieq, the project I worked on with Jake Swenson at Paradoxical. For anyone interested, here’s the video!
Sep 22, 2016
Mocking nested objects with mockito
Yes, I know it’s a code smell. But I live in the real world, and sometimes you need to mock nested objects. This is a scenario like:
Aug 3, 2016
Extracting scala method names from objects with macros
I have a soft spot in me for ASTs ever since I went through the exercise of building my own language. Working in Java I missed the dynamic ability to get compile time information, though I knew it was available as part of the annotation processing pipeline during compilation (which is how lombok works). Scala has something similar in the concept of macros: a way to hook into the compiler, manipulate or inspect the syntax tree, and rewrite or inject whatever you want. It’s a wonderfully elegant system that reminds me of Lisp/Clojure macros.
Jul 26, 2016
Dealing with a bad symbolic reference in scala
Every time this hits me I have to think about it. The compiler barfs at you with something ambiguous like
Jun 8, 2016
Scripting deployment of clusters in asgard
We use asgard at work to do deployments in both qa and production. Our general flow is to check in, have jenkins build, an AMI is created, and then … we have to manually go to asgard and deploy it. That sucks.
Jun 3, 2016
Unit testing DNS failovers
Something that’s come up a few times in my career is the difficulty of validating if and when your code can handle actual DNS changes. A lot of times testing that you have the right JVM settings and that your 3rd party clients can handle it involves mucking with hosts files, nameservers, or stuff like Route53 and waiting around. Then it’s hard to automate and deterministically reproduce. However, you can hook into the DNS resolution in the JVM to control what gets resolved to what. And this way you can tweak the resolution in a test and see what breaks! I found some info at this blog post and cleaned it up a bit for usage in scala.
Apr 19, 2016
CassieQ at the Seattle Cassandra Users Meetup
Last night Jake and I presented CassieQ (the distributed message queue on cassandra) at the seattle cassandra users meetup at the Expedia building in Bellevue. Thanks to everyone who came out and chatted with us, we certainly learned a lot and had some great conversations regarding potential optimizations to include in CassieQ.
Mar 24, 2016
Consistent hashing for fun
I think consistent hashing is pretty fascinating. It lets you define a ring of machines that shard out data by a hash value. Imagine that your hash space is 0 -> Int.Max, and you have 2 machines. Well one machine gets all values hashed from 0 -> Int.Max/2 and the other from Int.Max/2 -> Int.Max. Clever. This is one of the major algorithms of distributed systems like cassandra and dynamoDB.
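To make the ring concrete, here is a small sketch under stated assumptions rather than the post's implementation: each machine is placed on the ring at the hash of its name, and a key belongs to the first machine at or after the key's hash, wrapping around at the end. `HashRing`, `stable_hash`, and the 32-bit MD5-derived hash space are hypothetical choices for illustration. Note that hashing machine names gives uneven ranges, unlike the clean half-and-half split described above; real systems usually smooth this out with virtual nodes.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Hash a string into a 32-bit ring position (illustrative choice)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:4], "big")


class HashRing:
    """Hypothetical minimal ring: one position per machine, no virtual nodes."""

    def __init__(self, nodes):
        # Sort machines by their position on the ring.
        self._ring = sorted((stable_hash(n), n) for n in nodes)
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        # First machine at or after the key's position, wrapping to the start.
        idx = bisect.bisect_left(self._points, stable_hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

The useful property is that adding a machine only moves the keys that now fall into the new machine's range; every other key stays where it was.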
Feb 21, 2016
A toy generational garbage collector
Had a little downtime today and figured I’d make a toy generational garbage collector, for funsies. A friend of mine was once asked this as an interview question so I thought it might make for some good weekend practice.
Feb 5, 2016
RMQ failures from suspended VMs
My team recently ran into a bizarre RMQ partition failure in a production cluster. RMQ doesn’t handle partition failures well, and while you can set up auto recovery (such as suspension of minority groups) you need to manually recover from it. The one time I’ve encountered this I got a very useful message in the admin management page indicating that parts of the cluster were in partition failure, but this time things went weird.
Jan 29, 2016
Logging the easy way

This is a cross post from the original posting at godaddy’s engineering blog. This is a project I have spent considerable time working on and leverage a lot.
Jan 27, 2016
Serialization of lombok value types with jackson
For anyone who uses lombok with jackson, you should check out jackson-lombok which is a fork from xebia that allows lombok value types (and lombok generated constructors) to be json creators.
Jan 24, 2016
Cassandra DB migrations
When doing any application that involves a persistent data storage you usually need a way to upgrade and change your database using a set of scripts. Working with patterns like ActiveRecord you get easy up/down by version migrations. But with cassandra, which traditionally was schemaless, there aren’t that many tools out there to do this.
Jan 22, 2016
Dalloc - coordinating resource distribution using hazelcast
A fun problem that has come up during the implementation of cassieq (a distributed queue based on cassandra) is how to evenly distribute resources across a group of machines. There is a scenario in cassieq where writes can be delayed, and as such there is a custom worker in the app (by queue) who watches a queue to see if a delayed write comes in and republishes the message to a bucket later on. It’s transparent to the user, but if we have multiple workers on the same queue we could potentially republish the message twice. While technically that falls within the SLA we’ve set for cassieq (at least once delivery) it’d be nice to avoid this particular race condition.
Jan 22, 2016
Leadership election with cassandra
Cassandra has a neat feature that lets you expire data in a column. Using this handy little feature, you can create simple leadership election using cassandra. The whole process is described here which talks about leveraging Cassandra’s consensus and the column expiration to create leadership electors.
Dec 12, 2015
Plugin class loaders are hard
Plugin based systems are really common. Jenkins, Jira, wordpress, whatever. Recently I built a plugin workflow for a system at work and have been mired in the joys of the class loader. For the uninitiated, a class in Java is identified uniquely by the class loader instance it is created from as well as its fully qualified class name. This means that foo.bar class loaded by class loader A is not the same as foo.bar class loaded by class loader B.
Nov 28, 2015
Project angelhair: Building a queue on cassandra
Edit: this project has since been moved to CassieQ: https://github.com/paradoxical-io/cassieq
Oct 19, 2015
Dynamic HAProxy configs with puppet
I’ve posted a little about puppet and our teams ops in the past since my team has pretty heavily invested in the dev portion of the ops role. Our initial foray into ops included us building a pretty basic puppet role based system which we use to coordinate docker deployments of our java services.
Aug 16, 2015
Adventures in pretty printing JSON in haskell
Today I gave atom haskell-ide a whirl and wanted to play with haskell a bit more. I’ve played with haskell in the past and always been put off by the tooling. To be fair, I still kind of am. I love the idea of the language but the tooling is just not there to make it an enjoyable exploratory experience. I spend half my time in the repl inspecting types, the other half on hoogle, and the 3rd half (yes I know) being frustrated that I can’t just type in package names and explore API’s in sublime or atom or wherever I am. Now that I’m on a mac, maybe I’ll give leksah another try. I tried it a while ago and it didn’t work well.
Aug 14, 2015
Automating deployments with salt, puppet, jenkins and docker
I know, it’s a buzzword mouthful. My team has had good first success leveraging jenkins, salt, sensu, puppet, and docker to package and monitor distributed java services with a one click deployment story so I wanted to share how we’ve set things up.
Jul 8, 2015
Testing puppet with docker and python
In all the past positions I’ve been in I’ve been lucky enough to have a dedicated ops team to handle service deployment, cluster health, and machine management. However, at my new company there is much more of a “self serve” mentality such that each team needs to handle things themselves. On the one hand this is a huge pain in my ass, since really the last thing I want to do is deal with clusters and machines. On the other hand though, because we have the ability to spin up openstack boxes in our data centers at the click of a button, each team has the flexibility to host their own infrastructure and stack.
May 7, 2015
Converting akka scala futures to java futures
Back in akka land! I’m using the ask pattern to get results back from actors since I have a requirement to block and get a result (I can’t wait for an actor to push at a later date). That’s fine, but converting from scala futures to java completable futures is a pain. I also (as mentioned in another post) want to make sure that my async responses capture and set the MDC for proper logging.
Apr 13, 2015
Shareable zsh environment: EnvZ
Introducing EnvZ.
Apr 7, 2015
Adding MDC logging to akka
I’ve mentioned before, but I’m working heavily in a project that is leveraging akka. I am really enjoying the message passing model and so far things are great, but tying in an MDC for the SLF4J logging context proved complicated. I had played with the custom executor model described here but hadn’t attempted the akka custom dispatcher.
Mar 30, 2015
Getting battery percentage in zsh
I’m on osx maverick still at home on my laptop and I spent part of today dicking around customizing my zsh shell. I wanted to be able to show my battery percentage in the shell and it’s really pretty easy.
Mar 25, 2015
Handling subclassed constraints with a DSL in java 8
I really like doing all of my domain modeling with clean DSL’s (domain specific languages). Basically I want my code to read like a sentence, and to hide all the magic behind things. When things read clearly even a non professional can determine if something is wrong. The ideal scenario is to have your code read like pseudocode since nobody really cares what the internals are, what matters is your general solution.
Mar 15, 2015
Installing leiningen on windows
Figured I’d spend part of the afternoon and play with clojure but was immediately thwarted trying to install leiningen on windows via powershell. I tried the msi installer but it didn’t seem to do anything, so I went to my ~/.lein/bin folder and ran
Mar 14, 2015
Simplifying class matching with java 8
I’m knee deep in akka these days and it’s a great queueing framework, but unfortunately I’m stuck using java and not able to use scala (business decisions, not mine!) so pattern matching on incoming untyped events can be kind of nasty.
Mar 12, 2015
Auto scaling akka routers
I’m working on a project where I need to multiplex many requests through a finite set of open sockets. For example, I have 200 messages, but I can only have at max 10 sockets open. To accomplish this I’ve wrapped the sockets in akka actors and am using an akka routing mechanism to “share” the 10 open sockets through a roundrobin queue.
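As a language-neutral illustration of the sharing idea (this is not akka, and not the post's code), round-robin routing just hands each message to the next worker in a fixed cycle. `route_round_robin` and the worker labels are hypothetical names; in the real setup each worker would be an actor wrapping an open socket.

```python
from itertools import cycle


def route_round_robin(messages, workers):
    """Pair each message with the next worker in a repeating cycle.

    Hypothetical helper: returns (worker, message) pairs instead of
    actually sending anything over a socket.
    """
    next_worker = cycle(workers)
    return [(next(next_worker), message) for message in messages]
```

With 200 messages and 10 workers, every worker ends up with exactly 20 messages, so no more than 10 sockets are ever needed.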
Feb 5, 2015
Tiny types scala edition
Previously I wrote about generating value type wrappers on top of C# primitives for better handling of domain level knowledge. This time I decided to try it out in scala as I’m jumping into the JVM world.
Feb 3, 2015
Simple log context wrapper
I’m still toying around with the scala play! framework and I wanted to check out how I can make logging contextual information easy. In the past with .NET I’ve used and written libraries that wrap the current log provider and give you extra niceties with logging. One of my favorites was being able to do stuff like
Jan 30, 2015
Conditional injection with scala play and guice
It’s been a crazy year for me. For those who don’t know I moved from the east coast to the west coast to work for a rather large software company in seattle (I’ll let you figure which one out) and after a few short weeks realized I made a horrible mistake and left the team. I then found a cool job at a smaller .net startup that was based in SF and met some awesome people and learned a lot. But, I’ve been poached by an old coworker and am now going to go work at a place that uses more open source things so I decided to kick into gear and investigate scala and play.
Jan 6, 2015
Quickly associate file types with a default program
I use JuJuEdit to open all my log files since it starts up fast, is pretty bare bones, but better than notepad. The way my log4net appender is set up is that log files are kept for 10 days and get a .N appended to them for each backup. I.e.
Jan 6, 2015
Creating stronger value type contracts
I’ve long been annoyed that value types don’t have strong semantic information attached to them such that the compiler would barf if I try and pass a value type that isn’t semantically the same as what the function wanted. For example, what does the following signature mean other than taking in 2 ints and returning a bool?
Aug 24, 2014
AngularJS for .Net developers
A few months ago I was asked to be a technical reviewer on a new packt pub book called AngularJS for .Net developers. It mostly revolves around ServiceStack (not web API) and building a full stack application with angular. I actually really enjoyed reading it and thought it touched on a lot of great points that a developer who is serious needs to know about.
Aug 24, 2014
Leveraging message passing to do currying in ruby
I’m not much of a ruby guy, but I had the inkling to play with it this weekend. The first thing I do when I’m in a new language is try to map constructs that I’m familiar with, from basic stuff like object instantiation, singletons, inheritance, to more complicated paradigms like lambdas and currying.
Aug 4, 2014
Sometimes you have to fail hard

This was a post I wrote in the middle of 2013 but never published. I wanted to share this since it’s a common story across all technologies and developers of all skill levels. Sometimes things really just don’t work. As a post-script, I did come back to this project and had a lot of success. When in doubt, let time figure it out :)
Jul 17, 2014
wcf Request Entity Too Large
I ran into a stupid issue today with WCF request entity too large errors. If you’re sure your bindings are set properly on both the server and client, make sure to double check that the service name and contract are set properly in the server.
Jul 10, 2014
Short and sweet powershell prompt with posh-git
My company has fully switched to git and it’s been great. Most people at work use SourceTree as a gui to manage their git workflow, some use only command line, and I use a mixture of posh-git in powershell with tortoise git when I need to visualize things.
Jul 7, 2014
Multiple SignalR clients and ASMX service calls from the same application
I was writing a test application to simulate what multiple signalR clients to a server would act like. The clients were triggered by the server and then would initiate a sequence of asmx web service calls back to the server using a legacy web service. This way I was using signalR as a triggering mechanism and not as a data transport. For my purpose this worked out great.
Jun 2, 2014
Constraint based sudoku solver
A few weekends ago I decided to give solving Sudoku a try. In case you aren’t familiar with Sudoku, here is what an unsolved board looks like
May 12, 2014
Creating futures
Futures (and promises) are a fun and useful design pattern in that they help encapsulate asynchronous work into composable objects. That and they help hide away the actual asynchronous execution implementation. It doesn’t matter if the future is finally resolved on the threadpool, in a new thread, or in an event loop (like nodejs).
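A small sketch of that last point, using Python's standard `concurrent.futures` rather than anything from the post: the consumer only sees a `Future`, whether it was resolved by a dedicated thread or by a thread pool. `resolve_in_thread`, `resolve_in_pool`, and `describe` are hypothetical helpers written for this illustration.

```python
import threading
from concurrent.futures import Future, ThreadPoolExecutor


def resolve_in_thread(fn) -> Future:
    """Run fn on a brand new thread and expose the outcome as a Future."""
    fut = Future()

    def run():
        try:
            fut.set_result(fn())
        except Exception as exc:  # surface failures through the future
            fut.set_exception(exc)

    threading.Thread(target=run).start()
    return fut


def resolve_in_pool(pool: ThreadPoolExecutor, fn) -> Future:
    """Run fn on a shared thread pool; submit() already returns a Future."""
    return pool.submit(fn)


def describe(fut: Future) -> str:
    """Consumer code: identical no matter how the future gets resolved."""
    return f"got {fut.result(timeout=5)}"
```

`describe` never needs to know which of the two helpers produced the future, which is exactly the decoupling described above.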
Apr 28, 2014
Instagram viewer with node and angular
I have an artist buddy who is working on an art installation and asked me if there was a way to display a realtime view of an instagram hashtag feed on a projector.
Apr 14, 2014
Building a prefix trie
Prefix tries are cool data structures that let you compress a dictionary of words based on their shared prefix. If you think about it, this makes a lot of sense. Why store abs, abbr, and abysmal when you only need to store a,b,b,r,s,y,s,m,a,l. Only storing what you have to (based on prefix) in this example gives you a 70% compression ratio! Not too bad, and it would only get better the more words you added.
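Here is a minimal sketch of the structure, assuming a plain dictionary-of-children design (the post's own implementation may differ). `Trie`, `TrieNode`, and `node_count` are names made up for this example; `node_count` counts stored characters so the sharing can be checked directly.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # next character -> child node
        self.is_word = False  # True when a stored word ends here


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        node = self.root
        for ch in word:
            # Reuse the existing child when the prefix is already stored.
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def contains(self, word: str) -> bool:
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_word

    def node_count(self) -> int:
        """Number of stored characters (every node except the root)."""
        count, stack = 0, [self.root]
        while stack:
            node = stack.pop()
            count += len(node.children)
            stack.extend(node.children.values())
        return count
```

Inserting abs, abbr, and abysmal stores 10 character nodes for 14 input characters, which is where the roughly 70% figure comes from.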
Apr 4, 2014
Avoiding nulls with expression trees
I’ve blogged about this subject before, but I REALLY hate null refs. This is one of the reasons I love F# and other functional languages, null refs almost never happen. But, in the real world I work as a C# dev and have to live with C#’s… nuisances.
Mar 23, 2014
Strongly typed powershell csv parser
Somehow I missed the powershell boat. I’ve been a .NET developer for years and I trudged through using the boring old cmd terminal, frequently mumbling about how much I missed zsh. But something snapped and I decided to really dive into powershell and learn why those who use it really love the hell out of it. After realizing that the reason everyone loves it is because everything is strongly typed and you can use .NET in your shell I was totally sold.
Mar 10, 2014
A simple templating engine
I wanted to talk about templating, since templating is a common thing you run into. Often times you want to cleanly do a string replace on a bunch of text, and sometimes even need minimal language processing to do what you want. For example, Java has a templating engine called Velocity, but lots of languages have libraries that do this kind of work. I thought it’d be fun to create a small templating engine from scratch with F# as an after work exercise.
Mar 8, 2014
RxJava Observables and Akka actors
I was playing with both akka and rxjava and came across the following post that described how to map rxjava observables from messages posted to akka actors.
Feb 27, 2014
Debugging F# NUnit equals for mixed type tuples
Twitter user Richard Dalton asked a great question recently:
Feb 27, 2014
Single producer many consumer
When I’m bored, I like to roll my own versions of things that already exist. That’s not to say I use them in production, but I find that they are great learning tools. If you read the blog regularly you probably have realized I do this A LOT. Anyways, today is no different. I was thinking about single producer, multiple consumer functions, like an SNS Topic, but for your local machine. In reality, the best way to do this would be to publish your event through an Rx stream and consume it with multiple subscribers, but that’s no fun. I want to roll my own!
Feb 24, 2014
Building LINQ in Java pt 2
In my last post I discussed building a static class that worked as the fluent interface exposing different iterator sources that provide transformations. For 1:1 iterators, like take, skip, while, for, nth, first, last, windowed, etc, you just do whatever you need to do internally in the iterator by manipulating the output of the stream.
Feb 3, 2014
Logitech mx mouse zoom button middle click on Ubuntu
Any good engineer has their own tools of their trade: keyboard, mouse, and licenses to their favorite editors (oh and a badass chair).
Jan 27, 2014
Filter on deep object properties in angularjs
AngularJS provides a neat way of filtering arrays in an ng-repeat tag by piping elements to a built in filter filter which will filter based on a predicate. Doing it this way you can filter items based on a function, or an expression (evaluated to a literal), or by an object.
Jan 20, 2014
A daily programmer - nuts and bolts
I’ve mentioned r/dailyprogrammer in previous posts, since I think they are fun little problems to solve when I have time on my hands. They’re also great problem sets to do when learning a new language.
Jan 13, 2014
Getting started with haskell
I wanted to share how I’ve finally settled on my haskell development environment and how I got it set up, since the process in the end wasn’t that trivial. Hopefully anyone else starting in haskell can avoid the annoyances and pitfalls that I ran into and get up and running (and doing haskell) quickly.
Dec 30, 2013
Building LINQ in Java
Now that Java 8 has lambdas, I decided to check out what kind of lazy collection support their streams functionality had. It had some cool stuff, like
Dec 16, 2013
Checking if a socket is connected
Testing if a socket is still open isn’t as easy as it sounds. Anyone who has ever dealt with socket programming knows this is a hassle. The general pattern is to poll on the socket to see if it’s still available, usually by sitting in an infinite loop. However, with f# this can be done more elegantly using async and some decoupled functions.
Dec 11, 2013
Pulling back all repos of a github user
I recently had to relinquish my trusty dev machine (my work laptop) since I got a new job, and as such am relegated to using my old mac laptop at home for development until I either find a new personal dev machine or get a new work laptop. For those who don’t know, I’m leaving the DC area and moving to Seattle to work for Amazon, so that’s pretty cool! Downside is that it’s Java and Java kind of sucks, but I can still do f#, haskell, and all the other fun stuff on the side.
Dec 3, 2013
F# utilities in haskell
Slowly I am getting more familiar with Haskell, but there are some things that really irk me. For example, a lot of the point free functions are right to left, instead of left to right. Coming from an F# background this drives me nuts. I want to see what happens first first not last.
Dec 2, 2013
24 hour time ranges
Dealing with time is hard, it’s really easy to make a mistake. Whenever I’m faced with a problem that deals with time I tend to spend an inordinate amount of time making sure I’m doing things right.
Nov 18, 2013
Java lambdas
I’m not a java person. I’ve never used it in production, nor have I spent any real time with it outside of my professional work. However, when a language dawns upon lambdas I am drawn to try out their implementation. I’ve long since despised Java for the reasons of verbosity, lack of real closures or events, type erasure in generics, and an over obsession with anonymous classes, so I’ve shied away from doing anything in it.
Oct 30, 2013
Reading socket commands
A few weeks ago I was working on a sample application that would simulate a complex state machine. The idea is that there is one control room, and many slave rooms, where each slave room has its own state. The control room can dispatch a state advance or state reverse to any room or collection of rooms, as well as query room states, and other room metadata.
Oct 14, 2013
The Arrow operator
Continuing my journey in functional programming, I decided to try doing the 99 haskell problems to wean my way into haskell. I’ve found this to be a lot of fun since they give you the answers to each problem and, even though I have functional experience, the haskell way is sometimes very different from what I would have expected.
Sep 30, 2013
Review of my first time experience with haskell editors
When you start learning a new language the first hurdle to overcome is how to edit, compile, and debug an application. In my professional career I rely heavily on visual studio and intellij IDEA as my two IDE workhorses. Things just work with them. I use visual studio for C#, C++, and F# development and IDEA for everything else (including scala, typescript, javascript, sass, ruby, and python).
Sep 26, 2013
Machine Learning with disaster video posted
A few weeks ago we had our second DC F# meetup with speaker Phil Trelford where he led a hands on session introducing decision trees. The goal of the meetup was to see how good of a predictor we could make of who would live and die on the titanic. Kaggle has an excellent data set that shows age, sex, ticket price, cabin number, class, and a bunch of other useful features describing Titanic passengers.
Sep 20, 2013
Till functions
Just wanted to share a couple little functions that I was playing with since it made my code terse and readable. At first I needed a way to fold a function until a predicate. This way I could stop and didn’t have to continue through the whole list. Then I needed to be able to do the same kind of thing but choosing all elements up until a predicate.
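A rough sketch of what two such helpers could look like; the post's actual signatures aren't shown here, so `fold_till` and `take_till` and their argument order are assumptions. In this version the fold's predicate sees the current state and the next item, and stops before applying the step.

```python
from typing import Callable, Iterable, List, TypeVar

A = TypeVar("A")
S = TypeVar("S")


def fold_till(stop: Callable[[S, A], bool],
              step: Callable[[S, A], S],
              state: S,
              items: Iterable[A]) -> S:
    """Fold over items, bailing out as soon as stop(state, item) is true."""
    for item in items:
        if stop(state, item):
            break
        state = step(state, item)
    return state


def take_till(stop: Callable[[A], bool], items: Iterable[A]) -> List[A]:
    """Collect items up to, but not including, the first one where stop(item) is true."""
    taken = []
    for item in items:
        if stop(item):
            break
        taken.append(item)
    return taken
```

For example, `fold_till(lambda total, x: total + x > 10, lambda total, x: total + x, 0, range(1, 100))` stops after adding 1 through 4, without walking the rest of the range.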
Sep 16, 2013
Angular with typescript architecture
Bear with me, this is going to be a long post.
Sep 10, 2013
Seq.unfold and creating bit masks
In the course of working on ParsecClone I needed some code that could take in an arbitrary byte array and convert it to a corresponding bit array. The idea is if I have an array of
Sep 10, 2013
Thinking about haskell functors in .net
I’ve been teaching myself haskell lately and came across an interesting language feature called functors. Functors are a way of describing a transformation when you have a boxed container. They have a generic signature of
Aug 25, 2013
ParsecClone on nuget
Today I published the first version of ParsecClone to nuget. I blogged recently about creating my own parser combinator and it’s come along pretty well. While FParsec is more performant and better optimized, mine has other advantages (such as being able to work on arbitrary consumption streams such as binary or bit level) and work directly on strings with regex instead of character by character. Though I wouldn’t recommend using ParsecClone for production string parsing if you have big data sets, since the string parsing isn’t streamed. It works directly on a string. That’s still on the todo list, however the binary parsing does work on streams.
Aug 25, 2013
Machine learning from disaster
If any of my readers are in the DC/MD/VA area you should all come to the next DC F# meetup that I’m organizing on september 16th (monday). The topic this time is machine learning from disaster, and we’ll get to find out who lives and dies on the Titanic! We’re bringing in guest speaker Phil Trelford so you know it’s going to be awesome! Phil is in the DC area on his way to the F# skills matters conference in NYC a few days later. I won’t be there but I expect that it will be top notch since all the big F# players are there (such as Don Syme and Tomas Petricek)!
Aug 25, 2013
Implementing the game "Arithmetic"
There is a subreddit on reddit called /r/dailyprogrammer and while they don’t actually post exercises daily, they do sometimes post neat questions that are fun to solve. About a week ago, they posted a problem that I solved with F# that I wanted to share. For the impatient, my full source is available at this fssnip.
Aug 22, 2013
Tech talk: Pattern matching
Today’s tech talk was about functional pattern matching. This was a really fun one since I’ve been sort of “evangelizing” functional programming at work, and it was a blast seeing everyone ask poignant and interesting questions regarding pattern matching.
Aug 19, 2013
Parse whatever with your own parser combinator
In a few recent posts I talked about playing with fparsec to parse data into usable syntax trees. But, even after all the time spent fiddling with it, I really didn’t fully understand how combinators actually worked. With that in mind, I decided to build a version of fparsec from scratch. What better way to understand something than to build it yourself? I had one personal stipulation, and that was to not look at the fparsec source. To be fair, I cheated with one function (the very first one) so I kind of cheated a lot, but I didn’t peek at anything else, promise.
Aug 19, 2013
Coding Dojo: a gentle introduction to Machine Learning with F# review
Recently I organized an F# meetup in DC, and for our first event we brought in a wonderful speaker (Mathias Brandewinder) whose topic was called: “Coding Dojo: a gentle introduction to Machine Learning with F#”.
Aug 14, 2013
F# class getter fun
I was playing with Neo4J (following a recent post I stumbled upon by Sergey Tihon), and had everything wired up and ready to test out, but when I tried running my code I kept getting errors saying that I hadn’t connected to the neo4j database. This puzzled me because I had clearly called connect, but every time I tried to access my connection object I got an error.
Aug 5, 2013
Trees and continuation passing style
For no reason in particular I decided to revisit tree traversal as a kind of programming kata. There are two main kinds of tree traversal:
Jul 29, 2013
Strongly typing SignalR
I’m a big fan of strong typing. If you can leverage the compiler to give you an error (or warning) before you deploy code, all the better. That means you won’t, ideally, push a bug into the field. So I have a big problem with frameworks and libraries that rely on dynamic objects, or even worse, stringly typed things. Don’t get me wrong, sometimes dynamics are the only way to solve the problem, but whenever I run into one I’m always afraid that I’m going to get a runtime error since I don’t really know what I’m acting on till later.
Jul 26, 2013
F# and Machine learning Meetup in DC
As you may have figured out, I like F# and I like functional languages. At some point I tweeted to the f# community lamenting that there was a dearth of F# meetups in the DC area. Lo and behold, tons of people replied saying they’d be interested in forming one, and some notable speakers piped up and said they’d come and speak if I set something up.
Jul 22, 2013
SignalR on ios and a single domain
Safari on ios has a limitation that you can only have one concurrent request to a particular domain at a time. Normally this is fine, since once a request completes the next one that is queued up fires off. But what if you are using a realtime persistent connection library like signalR? In this case your one allowed connection is held up with the signalR request. If you’re not on a mac or linux and you use windows 7 or earlier you can’t use websockets so you’re stuck using http. Most suggestions involve buying a second domain, but sometimes that’s not possible, especially if your application is a distributable web app that can run on client machines. You can’t expect clients to have to buy a second domain just so your realtime push works.
Jul 18, 2013
Tech talk: CLR Memory Diagnostics
In today’s tech talk we discussed the recent release from Microsoft of ClrMD that lets you attach and debug processes using an exposed API. You used to be able to do this in WinDbg using the SOS plugin, but now they’ve wrapped SOS in a managed dll that you can use to inspect CLR process information. The nice thing about this is you can now automate debugging inspections. It’s now as easy as
Jul 15, 2013
Reworking my language parser with fparsec
Since I was playing with fparsec last week, I decided to redo (or mostly) the parser for my homebrew language that I’ve previously posted about. Using fparsec made the parser surprisingly succinct and expressive. In fact I was able to do most of this in an afternoon, which is impressive considering my last C# attempt took 2 weeks to hammer out.
Jul 7, 2013
Locale parser with fparsec
Localizing an application consists of extracting out user directed text and managing it outside of hardcoded strings in your code. This lets you tweak strings without having to recompile, and if done properly, allows you to support multiple languages. Localizing is no easy task: it messes up spacing, formatting, name/date, and other cultural information, but that’s a separate issue. The crux of localizing is text.
Jul 1, 2013
Linear separability and the boundary of wx+b
In machine learning, everyone talks about weights and activations, often in conjunction with a formula of the form wx+b. While reading machine learning in action I frequently saw this formula but didn’t really understand what it meant. Obviously it’s a line of some sort, but what does the line mean? Where does w come from? I was able to muddle past this for decision trees, and naive bayes, but when I got to support vector machines I was pretty confused. I wasn’t able to follow the math and conceptually things got muddled.
Jun 24, 2013
Ordered Consumable
I had the need for a specific collection type where I would only ever process an element once, but be able to arbitrarily jump around and process different elements. Once a jump happened, the elements would be processed in circular order: continue to the end, then loop around to the beginning and process any remaining items.
Jun 17, 2013
Threadpooling in netduino
Sometimes you want to do asynchronous work without holding up your current thread but the work that needs to be done doesn’t really warrant the cost of spinning up a new thread (though what the exact cost is on an embedded environment I’m not sure).
Jun 12, 2013
Qconn NYC 2013
If anyone is at qconn this year come find me (I’m wearing an adult swim hoodie)! There won’t be a tech talk this week since I’m busy at the conf but things will return back to normal next week.
Jun 10, 2013
Automatic fogbugz triage with naive bayes
At my work we use fogbugz for our bugtracker and over our company's lifetime we have accumulated tens of thousands of cases. I was thinking recently that this is an interesting repository of historical data and I wanted to see what I could do with it. What if I was able to predict, to some degree of accuracy, who the case would be assigned to based solely on the case title? What about area? Or priority? Being able to predict who a case gets assigned to could alleviate a big time burden on the bug triager.

Thankfully, I'm reading "Machine Learning In Action" and came across the naive bayes classifier, which seemed a good fit for me to use to try and categorize cases based on their titles. Naive bayes is most famously used as part of spam filtering algorithms. The general idea is you train the classifier with some known documents to seed the algorithm. Once you have a trained data set you can run new documents through it to see what they classify as (spam or not spam).

For those who've never used Fogbugz, let me illustrate the data that's available to me. I've highlighted a few areas I'm going to use. The title is what we're going to use as the prediction value (highlighted blue), and the other red highlights are categories I want to predict (area, priority, and who the case is assigned to).

For the impatient, full source code of my bayes classifier is available on my github.

Conditional probability

Conditional probability describes the probability of an item given you already know something about it. Formally it's described in the syntax of P(A | B), which is pronounced as "probability of A given B". A good example is provided in the machine learning book. Imagine you have 7 marbles: 3 white marbles and 4 black marbles. What's the probability of a white marble? It's 3/7. How about a black marble? It's 4/7.

Now imagine you introduce two buckets: a blue bucket and a red bucket. In the red bucket, you have 2 white marbles and 2 black marbles. In the blue bucket you have 1 white marble and 2 black marbles. What's the probability of getting a white marble from the blue bucket? It's 1/3. There is only one white marble in the blue bucket, and 3 total marbles. So, P(white marble | blue bucket) is 1/3.
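
The bucket example can be checked by simple counting. Here is a minimal sketch in Python (the language the book uses); the `buckets` dictionary and the `p_color_given_bucket` helper are names made up for this illustration.

```python
from fractions import Fraction

# Marble counts from the example above: bucket -> color -> count.
buckets = {
    "red": {"white": 2, "black": 2},
    "blue": {"white": 1, "black": 2},
}

def p_color_given_bucket(color, bucket):
    """P(color | bucket): the share of that bucket's marbles with this color."""
    counts = buckets[bucket]
    return Fraction(counts[color], sum(counts.values()))

print(p_color_given_bucket("white", "blue"))  # 1/3
```

Using `Fraction` keeps the result as an exact ratio, so it matches the hand calculation instead of printing a rounded decimal.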

Bayes Formula

This doesn't really help though. What you really want is to be able to calculate P(red bucket | white marble). This is where bayes rule comes into play:

This formula describes how items and their conditions relate (marbles and buckets).
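
Written out in standard notation and applied to the marbles above, the rule can be worked through as follows. One assumption is added here: a marble is drawn uniformly at random from all 7, so P(red bucket) is 4/7, the share of marbles sitting in the red bucket.

```latex
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}

P(\text{red bucket} \mid \text{white marble})
  = \frac{P(\text{white marble} \mid \text{red bucket})\,P(\text{red bucket})}{P(\text{white marble})}
  = \frac{\frac{2}{4} \cdot \frac{4}{7}}{\frac{3}{7}}
  = \frac{2}{3}
```

The result agrees with counting directly: 2 of the 3 white marbles are in the red bucket.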

Conditional Independence

Naive bayes is called naive because it assumes that each item occurs independently of every other item. Getting a white marble isn't dependent on first getting a black marble. To put it another way, it assumes the word "delicious" is just as likely to be next to "sandwich" as it is to "stupid". That's not really the case: in reality "delicious" is much more likely to be next to "sandwich" than "stupid".

The naive portion is important to note, because it allows us to use the following property of conditionally independent data:
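
In standard notation, with A and B as placeholder events and C as the condition they are independent under (symbols chosen for this illustration), the property reads:

```latex
P(A \cap B \mid C) = P(A \mid C)\,P(B \mid C)
```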

What this formula means is that the probability of one thing AND another thing is the probability of each multiplied together. This applies to us since if the text is composed of words, and words are conditionally independent, then we can use the above property to determine the probability of text. In other words, you can expand P(text | spam) to be

```
text = word1 ∪ word2 ∪ word3 ∪ ... ∪ wordN

P(text | spam) = P(word1 | spam) * P(word2 | spam) * ... * P(wordN | spam)
```
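
The product above can be sketched in a few lines of Python. This is an illustrative sketch, not the classifier from the post: the `train` and `classify` helpers, the sample case titles, and the category names are all made up. It also adds two common tweaks that the text above does not mention: add-one (Laplace) smoothing, so an unseen word doesn't zero out the whole product, and summing logs instead of multiplying, to avoid floating point underflow on long titles.

```python
import math
from collections import Counter, defaultdict

def train(labeled_titles):
    """Count words per category and titles per category."""
    word_counts = defaultdict(Counter)  # category -> word -> occurrences
    category_counts = Counter()         # category -> number of titles
    for title, category in labeled_titles:
        category_counts[category] += 1
        word_counts[category].update(title.lower().split())
    return word_counts, category_counts

def classify(title, word_counts, category_counts):
    """Pick the category maximizing log P(category) + sum of log P(word | category)."""
    vocabulary = {w for counts in word_counts.values() for w in counts}
    total_titles = sum(category_counts.values())
    best_category, best_score = None, -math.inf
    for category, counts in word_counts.items():
        total_words = sum(counts.values())
        score = math.log(category_counts[category] / total_titles)
        for word in title.lower().split():
            # Add-one smoothing (an assumption added here, not from the post).
            p_word = (counts[word] + 1) / (total_words + len(vocabulary))
            score += math.log(p_word)
        if score > best_score:
            best_category, best_score = category, score
    return best_category

# Hypothetical training data: (case title, assigned area).
cases = [
    ("video playback crashes on start", "media"),
    ("audio out of sync with video", "media"),
    ("login page rejects valid password", "auth"),
    ("password reset email never arrives", "auth"),
]
word_counts, category_counts = train(cases)
print(classify("video crashes during playback", word_counts, category_counts))  # media
```

The same two functions would work for any of the categories mentioned earlier (area, priority, assignee), since only the labels in the training pairs change.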
Jun 8, 2013
Tech talk: B-Trees
Yesterday’s tech talk was on b-trees. B-trees are an interesting tree data structure that are used to minimize disk read access. Also, since they are self balancing, and optimized for sequential reads and inserts, they’re really good for file systems and databases. CouchDB, MongoDB, SQLite, SQL Server and other databases all use either a b-tree or a b+ tree as their data indexes, so it was interesting to discuss b-tree properties.
Jun 6, 2013
Working on a long term svn branch
I work on a reasonably small team and for the most part everyone works in trunk. But it can happen where you need to switch over to a long term feature branch (more than a week or two) that can last sometimes months. The problem here is that your branch can easily diverge from trunk. If the intent is that the feature branch will eventually become the master (trunk) then you should merge the feature branch frequently. For me, this method has worked really well.
Jun 3, 2013
Building an ID3 decision tree
After following Mathias Brandewinder’s series on converting the python from “Machine Learning in Action” to F#, I decided I’d give the book a try myself. Brandewinder’s blog is great and he went through chapter by chapter working through F# conversions. If you followed his series, this won’t be anything new. Still, I decided to do the same thing as a way to solidify the concepts for myself, and in order to differentiate my posts I am reworking the python code into C#. For the impatient, the full source is available at my github.
May 30, 2013
Tech Talk: Sorting of ratings
Today’s tech talk discussed different ways to sort ratings systems. The topic revolved around a blog post we discovered a while ago breaking down different problems with star based sorts.
May 27, 2013
Byte arrays, typed values, binary reader, and fwrite
I was trying to read a binary file created from a native app using the C# BinaryReader class but kept getting weird numbers. When I checked the hex in visual studio I saw that the bytes were backwards from what I expected, indicating endianness issues. This threw me for a loop since I was writing the file from C++ on the same machine that I was reading the file in C# in. Also, I wasn’t sending any data over the network so I was a little confused. Endianness is usually an issue across machine architectures or over the network.
May 20, 2013
Why \d is slower than [0-9]
I learned an interesting thing today about regular expressions via this stackoverflow question. \d, commonly used as a shorthand for digits (which we usually think of as 0-9) actually checks against all valid unicode digits.
May 20, 2013
Minimizing the null ref with dynamic proxies
In a production application you frequently can find yourself working with objects that have a large accessor chain like
May 17, 2013
Bad image format "Invalid access to memory location"
Wow, two bad image format posts in one day. So, the previous post talked about debugging 64bit vs 32 bit assemblies. But after that was solved I ran into another issue. This time with the message:
May 16, 2013
Determining 64bit or 32 bit .NET assemblies
I work on a 64 bit machine but frequently deploy to 32 bit machines. The code I work on though has native hooks so I always need to deploy assembly entry points at 32 bit. This means I am usually paranoid about the build configuration. However, sometimes things slip up and a 64 bit dll gets sent out or an entrypoint is built with ANY CPU set. Usually this is caught on our continuous build server, with some cryptic reason why a unit test that should be working is actually failing.
May 16, 2013
Streaming video to ios device with custom httphandler in asp.net
I ran into an interesting tidbit just now while trying to dynamically stream a video file using a custom http handler. The idea here is to bypass the static handler for a file so that I can perform authentication/preprocessing/etc when a user requests a video resource and I don’t have to expose a static folder with potentially sensitive resources.
May 13, 2013
Users by connections in SignalR
SignalR gives you events when users connect, disconnect, and reconnect, however the only identifying piece of information you have at this point is their connection ID. Unfortunately it’s not very practical to identify all your connected users strictly off their connectionIDs - usually you have some other identifier in your application (userID, email, etc).
May 10, 2013
Tech Talk: Path finding algorithms
Today’s tech talk was about path finding algorithms. The topic was picked because of a recent link shared on reddit that visualized different algorithms. The neat thing about the link is that you can really see how different algorithms and heuristics modify the route.
May 6, 2013
Building better regular expressions
Every software developer has at one point in time heard the adage
May 5, 2013
The largest mass problem
I was recently asked to write some code to find the largest contiguous group of synonymous elements in a two dimensional array. The idea is that you want to find the largest “land mass” in a problem where you have a game board that looks something like
May 3, 2013
Capturing union values with fparsec
I just started playing with fparsec which is a parser combinator library that lets you create chainable parsers to parse DSLs. After having built my own parser, lexer, and interpreter, playing with other libraries is really fun; I like seeing how others have done it. Unlike my mutable parser written in C#, with FParsec the idea is that it will encapsulate the underlying stream state and result into a parser object. Since F# is mostly immutable, this is how the underlying modified stream state gets captured and passed as a new stream to the next parser. I actually like this kind of workflow since you don’t need to create a grammar which is parsed and creates code for you (which is what ANTLR does). There’s something very appealing about having it be dynamic.
May 2, 2013
Tech Talk: AngularJS
Today’s tech talk was a continuation on front-end discussions we’re having. Last week we talked about typescript (I forgot to write it up) and this week we discussed the basics of angular. Angular is a front-end MVC framework written by google that, at first glance, looks completely different from previous javascript/html development. The basic gist is to strongly decouple logic into encapsulated modules. But that’s not all there is, there’s a lot to it. Angular has a templating engine, dependency injection, double bindings between views and controllers, event dispatching, etc.
May 1, 2013
Debugging Serialization Exception: The constructor to deserialize an object was not found.
Today I was debugging an exception that was occurring when remoting a data object between two .NET processes. I kept getting
Apr 29, 2013
Separation of concerns in node.js
I’ve been playing with typescript and node.js and I wanted to talk a little about how I’ve broken up my app source. It’s always good to modularize an application into smaller bits, and while node lets you do a lot, quickly, with just a little bit of code, as your application grows you really can’t put all your logic in one big app.ts.
Apr 24, 2013
Images, memory leaks, GDI+, and the aggregate function
I ran into a neat C# memory leak today that I wanted to share. It’s not often you get a clear undeniable leak in C# and so I really had fun figuring this one out.
Apr 22, 2013
A response to "Ten reasons to not use a functional programming language"
If you haven’t read the top ten reasons to not use a functional programming language, I think you should. It’s a well written post and ironically debunks a lot of the major trepidations people have with functional languages.
Apr 18, 2013
Tech Talk: Text Editors
Today’s tech talk was a little less tech but no less important. We got together and talked about the different text editors that we use and why we like them.
Apr 15, 2013
Command pattern with SignalR
I’m using SignalR as a long poll mechanism between multiple .NET clients because part of my project’s requirements is to have everything over http/https. There’s no point in rolling my own socket based long poll since SignalR has already done all the heavy lifting. Unfortunately since the app I work on is distributed I can’t upgrade my SignalR version from what I have (0.5.2) since the newer SignalR versions aren’t backwards compatible. This means I have to make do with what this version of SignalR gives me.
Apr 10, 2013
Jon Skeet, C#, and Resharper
Today, at 1pm EST, the venerable Jon Skeet had a goto meeting webinar sponsored by JetBrains reviewing weird and cool stuff about C# and Resharper. For those who aren’t in the know, Resharper is a static analysis tool for C# that is pretty much the best thing ever. Skeet’s a great speaker and my entire team at work and I watched the webinar in our conference room while eating lunch.
Apr 8, 2013
Capturing mutables in f#
I was talking about F# with a coworker recently and we were discussing the merits of a stateless system. Both of us really like the enforcement of having to inject state, and when necessary, returning a new modified copy of state. Functional languages want you to work with this pattern, but like with all things software, it’s good to be able to break the rules. This is one of the things I like about F#, you can create mutables and do work imperatively if you need to.
Apr 5, 2013
Tech talk: Hacking droid
Today’s tech talk was based off of a blog entry posted by facebook recently where they described the things they needed to do to get their mobile app running on android OS Froyo (v 2.2).
Apr 4, 2013
Advice to young engineers
I had the opportunity to represent the company I work for at an engineering networking event at the University of Maryland today catered to young engineering students of all disciplines. The basic idea was to be available for students to ask questions they don’t normally get to ask of working professionals such as “what’s the day to day like?” [lots of coffee, followed by coding all day], “what advice would you give to someone looking to get into xyz field”, etc.
Apr 1, 2013
Flyweight Locking
Locking is a necessary aspect of multithreading code: it prevents unpredictable behavior and makes sure code that is expected to run synchronously does so. Some situations can leverage lockless code, but not always. When you do need to do a lock you shouldn’t do it carelessly: if you lock a section of code that does some major work (such as database access) and it blocks other pending calls, you need to be cognizant that there could be a delay or bottleneck. However, just because we have to lock doesn’t mean we can’t do some simple optimizations depending on what our business logic is. If we only need to lock items per a defined group then we can leverage flyweight locking. Let’s go through an example to make this scenario clearer.
Mar 25, 2013
Mongoose with TypeScript
Mongoose is a library for node.js that wraps the mongoDB driver. Since I’ve been playing with typescript, I wanted to show a short demo of strongly typing mongoose with unit tests run in nodeunit all using typescript.
Mar 22, 2013
Tech talk: Javascript Memory Leaks and JSWhiz
Today’s tech talk revolved around the recently published JSWhiz whitepaper from google. The paper discusses common javascript memory leak patterns. It also goes over how those leaks can be created and how google automated detection of them using Closure type annotations.
Mar 19, 2013
Merging two immutable dictionaries in F#
If you ever need to merge two immutable dictionaries (maps) that may share the same key, here is how I did it
Mar 18, 2013
When to abort a thread
When is it OK to abort a thread is a question that comes up every so often. Usually everyone jumps on the bandwagon that you should never ever do a thread abort, but I don’t agree. Certainly there are times when it’s valid and if you understand what you are doing then it’s ok to use.
Mar 11, 2013
Implementing partial functions
This next section I had a lot of fun with, and originally I didn’t plan on implementing it at all. The only reason I did it is because I had a stroke of genius while in the shower one morning. Today, I’m going to talk about how I supported partial functions in my toy programming language.
Mar 9, 2013
Tech talk: Service stack
In today’s tech talk the team and I talked about ServiceStack. I’ve heard a lot of hype about it but never really understood what it did or was about. Today, unfortunately, didn’t really clear any of that up.
Mar 8, 2013
Fixing "Calling LoadLibraryEx on ISAPI filter v4.0.30319 aspnet_filter.dll failed"
Calling LoadLibraryEx on ISAPI filter "C:\Windows\Microsoft.NET\Framework\v4.0.30319\aspnet_filter.dll" failed
Mar 7, 2013
Adding static typing and scope references, part 3: solving forward references
In an earlier post I gave a brief overview of the scope builder and its jobs. There I mentioned that supporting forward references required some extra work. In this post I’ll talk more about how I solved forward references.
Mar 6, 2013
Just another brainfuck interpreter
Why?

Honestly, why not?

The entry point

Not much to tell:

```csharp
static void Main(string[] args)
{
    var parser = new Parser("++++++++++[>+++++++>++++++++++>+++>+<<<<-]>++.>+.+++++++..+++.>++.<<+++++++++++++++.>.+++.------.--------.>+.>.");
```
Mar 6, 2013
Add scheduled task and run even if on battery power
Just wanted to share a little helpful snippet in case anyone needs it. To add a scheduled task and make sure it starts even when on battery power do this:
Mar 5, 2013
Adding static typing and scope validation, part 2: type inference and validation
This post continues my series describing how I solved certain problems while creating a toy programming language. Today I’ll discuss static typing and type inference.
Mar 5, 2013
Double encoding: URI and HTML encoding
URLs have special characters, like % and &, that you need to encode if you want to use them as part of your GET URI. For example:
Mar 4, 2013
Adding static typing and scope validation into the language, part 1
Continuing on my series discussing the language I wrote, this next post is going to talk about the basics of static typing and scope rules. So far my language implementation follows very closely to Parr’s examples in his book Language Implementation Patterns, which is what gave me the inspiration to do this project.
Mar 3, 2013
Configure all the things
I personally think that just about everything should be configurable, unless it’s absolutely never going to change. Even then, make it configurable, because it may change in the future. Think about your favorite command line tools, and the extensibility they have. They’re powerful because they are dynamic. They can be configured for a myriad of options and scenarios.
Mar 1, 2013
A handrolled language parser
In my previous post about building a custom lexer I mentioned that, for educational purposes, I created a simple toy programming language (still unnamed). There, I talked about building a tokenizer and lexer from scratch. In this post I’ll discuss building a parser that is responsible for generating an abstract syntax tree (AST) for my language. This syntax tree can then be passed to other language components such as a scope and type resolver, and finally an interpreter.
Mar 1, 2013
Tech talk: Bloom Filters
Each Thursday at work my team and I do a 45 minute to an hour discussion on any technical subject that we find interesting. We call these Thursday get togethers tech talks and I think they are awesome. We’ve been doing them for years and I’m hoping to start reposting our subjects and a blurb about our discussions each week after they happen.
Feb 27, 2013
Event emitters with success and fail methods for node.js
When it comes to node.js you hear a lot of hype, good and bad, so I’ve finally decided to take the plunge and investigate for myself what the fuss is about. So far it’s been interesting.
Feb 26, 2013
Building a custom lexer
As a software engineer I spend all day (hopefully) writing code. I love code and I love that there are languages that help me solve problems and create solutions. But as an engineer I always want to know more about the tools I work with so I recently picked up “Language Implementation Patterns” by Terence Parr and decided I was going to learn how to build a language. After reading through most of the book and working on examples for about 5 weeks I ended up building an interpreted toy general purpose language that has features like:
Feb 11, 2013
Thread Synchronization With Aspects
This article was originally published at tech.blinemedical.com
Jan 29, 2013
IxD 2013: Rhythm, Flow, and Style
This article was originally published at tech.blinemedical.com
Jan 28, 2013
IxD 2013 - Production ready CSS workshop
This article was originally published at tech.blinemedical.com
Jan 28, 2013
K-Means Step by Step in F#
This article was originally published at tech.blinemedical.com
Jan 18, 2013
Tracing computation expressions
This article was originally published at tech.blinemedical.com
Dec 21, 2012
Reading input in F#
This article was originally published at tech.blinemedical.com
Dec 14, 2012
Debugging piped operations in F#
This article was originally published at tech.blinemedical.com
Dec 6, 2012
RESTful web endpoints on Netduino Plus
This article was originally published at tech.blinemedical.com
Nov 23, 2012
Async producer/consumer the easy way
This article was originally published at tech.blinemedical.com
Nov 10, 2012
Dropped packets with promiscuous raw sockets and winsock
This article was originally published at tech.blinemedical.com
Oct 30, 2012
Run with real data
This article was originally published at tech.blinemedical.com
Oct 12, 2012
Inter process locking
This article was originally published at tech.blinemedical.com
Sep 4, 2012
A collection of simple AS3 string helpers
This article was originally published at tech.blinemedical.com
Aug 21, 2012
Handle reconnections to signalR host
This article was originally published at tech.blinemedical.com
