bjorg

Wednesday, November 20, 2013

Solving acoustics problems

A "waterfall plot" like this one is one of many tools used by
acousticians to determine the problems with a room.
Photo from realtraps which provides high quality bass traps,
an important type of acoustic treatment.
I recently received the following letter (edited):
      

Greetings,

The echo in my local church is really bad.  I am lucky if I can understand 10% of what’s being said.   I have checked with other members of the congregation and without exception they all have the same problem.

The church is medium size with high vaulted ceiling, very large windows with pillars spaced throughout.  The floor is mostly wood.   The speakers are flat against the side walls, spaced approx 15 metres apart and approx 10 feet above the floor.

The speakers are apparently ‘top of the range’… I just wonder if a graphic equalizer was used between the microphone and speaker, would this ‘clean up’ the sound a little?

I know that lining the walls with acoustic tiles and carpeting the floor would lessen the echo, but, we don’t want to do that if we can avoid it.

With regard to putting carpet on the floor, my thoughts are that instead of sound being absorbed by the carpet, the congregation present would absorb just as much as the carpet?.  One other theory I have is regarding the speakers.

If  the speakers were moved…

Michael

Hey Michael,

I sympathize with you. Going to service every week and not being able to understand what is being said must be very frustrating. While this is not the kind of thing I do every day, I do have some training  in this area and will do my best to give you something helpful.

Most churches are built with little attention to acoustics and old churches were built before there was any understanding of what acoustics is. With all those reflective surfaces and no care taken to prevent the acoustic problems that they create, problems are inevitable, and sometimes, such as in your church, they are simply out of hand. In a situation like that, even a great sound-system won't be able to solve the problem.

I recommend you hire a professional in your area to come look at the space and give some more specific feedback. To have them improve the situation may cost anywhere from hundreds to tens of thousands of dollars (or even more) depending on the cause of the problem. However, it's helpful to have some idea of what some of the solutions are so that when you hire that professional you are prepared for what's to come. You might be able to do some more research and take a stab at solving these issues yourself.

For example, it might be useful to listen to the room and conjecture, even without measurements, whether the problem is bound to specific frequencies or is just a problem of too many echoes. If you are a trained listener you might be able to stand in the room in various places, clap loudly and listen to get a sense of this. Although even a trained listener would never substitute such methods for actual measurements, I often find this method useful for developing a hypothesis (e.g. I might listen and say "I believe there is a problem in the low frequencies" before measuring, then use measurements to confirm or reject this hypothesis). Also, look at the room: are there lots of parallel walls? If so, you are likely suffering from problems at specific frequencies and it's possible that a targeted, and probably less expensive, approach will help.

Another thing you can do is find someone with some stage acting experience and have them speak loud and clear at the pulpit. Have them do this both with and without the sound system and listen to the results. If they sound much clearer without the sound-system than with the sound-system, then that suggests that your sound-system may be causing at least some of the problems.

If you can't afford an acoustician, but you are willing to experiment a bit, this kind of testing might lead you to something. For example, maybe you notice some large open parallel walls and you agree that covering one or both of them with some heavy draperies is either acceptable or would look nice. You could try it and see if it helps. It's no guarantee, but it might make a difference. Draperies are, of course, unlikely to make that much difference by themselves, so you might consider putting acoustic absorbing material behind them.

Be warned, however, that acoustic treatments done by amateurs without measurements are often beset with problems. For example, you may reduce the overall reverberation time, but leave lots of long echoes at certain frequencies. This can yield results that are no better than where you started -- possibly even worse (although in your case I think that's unlikely).

Here are the types of things a professional is likely to recommend. You've already alluded to all of them, but I'll repeat them with some more detail. I put them roughly in order of how likely they are to help, but it does depend on your specific situation:
  • Acoustic treatments. Churches like the one you describe are notorious for highly reflective surfaces like stone and glass, and as you surmised, adding absorptive materials to the walls, floors and ceiling will reduce the echo significantly. Also as you surmised, floor covering may be of limited effectiveness since people do also absorb and diffuse sound, but, of course, it depends on how much of the floor they cover and where. I understand your hesitation to go this route since it may impact the aesthetics of the church, and it may be expensive, but, as I mentioned above, depending on the specific situation, you may be able to achieve a dramatic result in acoustics with relatively little visual impact, and depending on the treatment needed you may be able to keep your costs controlled. You should also be able to collaborate with someone who can create acoustic treatments that are either not noticeable or enhance the esthetics of your space. (Of course, you'll also need someone familiar with things like local fire codes!)
  • Adjusting the speakers. It's certainly possible that putting the speakers in another location would help. If they were hung by a contractor or someone who did not take acoustics into account, they are likely to be placed poorly. Location matters more than the quality of the speakers themselves. Also, if the speakers are not in one cluster at the front, adding the appropriate delay to each set of speakers may help to ensure that sound arrives "coherently" from all speakers, which can improve intelligibility significantly. Devices to provide this kind of delay, and lots of other features, are sold under various names such as "speaker processors" and "speaker array controllers."
  • Electronic tools. Although this is likely to be least effective, you can usually achieve some improvement with EQ, as you suggested. For permanent installations, I prefer parametric EQs, but a high quality graphic will also work. An ad-hoc technique for setting the EQ is to increase the gain until you hear feedback, and then notch out the EQ frequency that causes the feedback. Continue increasing the gain until you are happy with the results. You must be very careful to protect your speakers and your hearing when using this technique, both of which can be easily damaged if you don't know what you are doing. Most speaker processors have built-in parametric EQs and some even come with a calibrated mike that you can use with the device to adjust the settings for you automatically. I've done this, and it works great, especially with a little manual tweaking, but you do have to know what you are doing. But, of course, you can't work miracles in a bad room.
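To give a feel for the delay mentioned under "Adjusting the speakers," here is a rough sketch in C. It assumes sound travels at about 343 m/s (air at roughly room temperature) and that you know how much farther a given speaker is from the person speaking than the front speakers are. The function name and the constant are my own illustration, not something a particular speaker processor exposes, and a professional would confirm the setting with measurements.

```c
/* Assumed speed of sound in air at roughly room temperature, in m/s. */
#define SPEED_OF_SOUND_M_PER_S 343.0

/* Delay, in milliseconds, to apply to a speaker that sits
   extra_distance_m metres farther from the original sound source
   than the front speakers do. */
double speaker_delay_ms( double extra_distance_m ) {
   return 1000.0 * extra_distance_m / SPEED_OF_SOUND_M_PER_S;
}
```

For a speaker 15 metres farther back, speaker_delay_ms( 15.0 ) comes out to roughly 44 ms.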

Saturday, September 21, 2013

Mapping Parameters


Visualizing a Linear Mapping
Very often we need to "map" one set of values to another. For example, we might have a slider that ranges from 0 to 1, and we want to use it to control the value of a frequency setting. Or perhaps we have the output of a sine wave (which ranges from -1 to 1) and we want to use that to control the intensity of an EQ. In these cases and many more, we can use a linear mapping to get from one range of values to another.

A linear mapping is simply a linear equation, such as y = mx + b, that takes an input, your slider value for example, and gives you back an output. The input is x, and the output is y. The trick is to find the values of m and b.

Let's take a concrete example. Let's say you have the output of a sine wave (say from an LFO) that oscillates between -1 and 1. Now we want to use those values to control a frequency setting from 200 to 2000. In this case, x from the equation above represents the oscillator, and y represents the frequency setting.

We know two things: we want x=-1 to map to y=200, and x=1 to map to y=2000. Since our original equation, y = mx + b, had two unknowns (m and b), we can solve it:

Original equation with both unknowns:
y = mx + b

Substituting our known values for x and y:
200 = (-1)m + b
2000 = (1)m + b

Solving for b (add the two equations so that m cancels):
2200 = 2b
1100 = b

Solving for m:
2000 = m + 1100
900 = m

Final equation:
y = 900x + 1100

You can check the final equation by substituting -1 and 1 for x and making sure you get 200 and 2000 respectively for y.

So in our LFO/frequency example, we would take our LFO value, say .75, and use that as x. Then plug that value into the formula (y=900(.75) + 1100=1775) and get our final value for our frequency setting.
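The same derivation can be wrapped in a small general-purpose function. This is a minimal sketch in C; the name map_linear and its parameter names are my own, and it assumes the input range is not empty (in_lo differs from in_hi).

```c
/* Map x from the range [in_lo, in_hi] to the range [out_lo, out_hi]
   using y = m*x + b. Assumes in_lo != in_hi. */
double map_linear( double x, double in_lo, double in_hi,
                   double out_lo, double out_hi ) {
   const double m = ( out_hi - out_lo ) / ( in_hi - in_lo ); /* slope  */
   const double b = out_lo - m * in_lo;                      /* offset */
   return m * x + b;
}
```

Calling map_linear( 0.75, -1, 1, 200, 2000 ) computes m = 900 and b = 1100 internally and returns 1775, matching the result worked out by hand.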

Wednesday, August 28, 2013

How to filter Miley Cyrus out of your facebook

This is quite possibly the most important post I have ever made. Here are specific instructions for filtering out every post in your facebook feed that contains the phrase "Miley Cyrus":


  • Download and install the browser extension called Social Fixer for your browser. It works for just about every browser except IE.
  • Depending on which browser you use, you might have to restart your browser. For Chrome, which I use most of the time, you just need to refresh your facebook page.
  • Social Fixer gives a little intro page when you open facebook. If you don't see this, it's not installed correctly. Go through it in case there's something else you want to do with it.
  • Click the little wrench icon that now appears on the facebook page, and select "edit social fixer options." Then select "Filtering" (1). Then under "Other:Matching Text" enter "miley cyrus" (2), and under "Action" select "hide" (3), just like in the picture.
  • Click Save (4), and refresh your facebook feed. It should now be Miley Cyrus free.

Sunday, July 21, 2013

Peak Meters, dBFS and Headroom

The level meter from audiofile engineering's
spectre program accurately shows peak values
in dBFS
Level meters are one of the most basic features of digital audio software. In software, they are very often implemented as peak meters, which are designed to track the maximum amplitude of the signal. Other kinds of meters, such as VU meters, are often simulations of analog meters. Loudness meters, which attempt to estimate our perception of volume rather than volume itself, are also becoming increasingly common. You may also come across RMS and average meters. In this post, I'm only going to talk about peak meters.

Peak Meters

Peak meters are useful in digital audio because they show the user information that is closely associated with the limits of the medium and because they are efficient and easy to implement. Under normal circumstances, we can expect peak meters to correspond pretty well with our perception of volume, but not perfectly. The general expectation users have when looking at peak meters is that if a signal goes above a certain level at some point, that level should be indicated on the meters. In other words, if the signal goes as high as, say -2 dBFS, over some time period, then someone watching the peak meter during that time will see the meter hit the -2 dBFS mark (see below for more on dBFS). Many peak meters have features such as "peak hold" specifically designed so that the user does not need to stare at the meter.

Beyond that, there are rarely any specifics. Some peak meters show their output linearly, some show their output in dB. Some use virtual LEDs, some a bar graph. In general, if there is a numeric readout or units associated with the meter, the unit should be dBFS.

Now that we know the basics of peak meters, let's figure out how to implement them.

Update Time

Peak meters should feel fast and responsive. However, they don't update instantly. In software, it is not uncommon to have audio samples run at 44100 samples per second while the display refreshes at only 75 times per second, so there is absolutely no point in showing the value of each sample (not to mention the fact that our eyes couldn't keep up). Clearly we need to figure out how to represent a large number of samples with only one value. For peak meters, we do this as follows:

  1. Figure out how often we want to update. For example, every 100 ms (.1s) is a good starting point, and will work well most of the time.
  2. Figure out how many samples we need to aggregate for each update. If we are sampling at 44100 Hz, a common rate, and want to update every .1s, we need N = 44100 * .1 = 4410 samples per update.
  3. Loop on blocks of size N. Find the peak in each block and display that peak. If the graphics system does not allow us to display a given peak, the next iteration should display the max of any undisplayed peaks.

Finding the Peak

Sound is created by air pressure swinging both above
and below the mean pressure.
Finding the peak of each block of N samples is the core of peak metering. To do so, we can't simply find the maximum value of all samples because sound waves contain not just peaks, but also troughs. If those troughs go further from the mean than the peaks, we will underestimate the peak.

The solution to this problem is simply to take the absolute value of each sample, and then find the max of those absolute values. In code, it would look something like this:



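float max = 0;
for( size_t i=0; i<buf.size(); ++i ) {
   // fabsf is the floating point absolute value (from <math.h> / <cmath>);
   // plain C abs() works on integers and would truncate the sample
   const float v = fabsf( buf[i] );
   if( v > max )
      max = v;
}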

At the end of this loop, max is your peak value for that block, and you can display it on the meter, or, optionally, calculate its value in dBFS first.
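Putting the update steps and the peak search together might look something like the following plain C sketch. The function names are hypothetical, the samples are assumed to be floats, and a final partial block is simply treated as a shorter block.

```c
#include <math.h>
#include <stddef.h>

/* Samples to aggregate per meter update, e.g. 44100 Hz * 0.1 s = 4410. */
size_t samples_per_update( double sample_rate_hz, double update_interval_s ) {
   return (size_t)( sample_rate_hz * update_interval_s + 0.5 ); /* round */
}

/* Writes one peak per block of block_size samples into peaks_out and
   returns the number of peaks written. A final partial block is included.
   peaks_out must have room for one value per block. */
size_t block_peaks( const float *samples, size_t count,
                    size_t block_size, float *peaks_out ) {
   size_t blocks = 0;
   if( block_size == 0 )
      return 0;
   for( size_t start = 0; start < count; start += block_size ) {
      size_t end = start + block_size;
      if( end > count )
         end = count;
      float peak = 0.0f;
      for( size_t i = start; i < end; ++i ) {
         const float v = fabsf( samples[i] ); /* troughs count too */
         if( v > peak )
            peak = v;
      }
      peaks_out[blocks++] = peak;
   }
   return blocks;
}
```

In a real meter you would normally process one block at a time as audio arrives rather than a whole buffer at once, but the arithmetic is the same.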

Calculating dBFS or Headroom

(For a more complete and less "arm wavy" intro to decibels, try here or here.) The standard unit for measuring audio levels is the decibel or dB. But the dB by itself is something of an incomplete unit, because, loosely speaking, instead of telling you the amplitude of something, dB tells you the amplitude of something relative to something else. Therefore, to say something has an amplitude of 3dB is meaningless. Even saying it has an amplitude of 0dB is meaningless. You always need some point of reference. In digital audio, the standard point of reference is "Full Scale", ie, the maximum value that digital audio can take on without clipping. If you are representing your audio as a float, 0 dB is nominally calibrated to +/- 1.0. We call this scale dBFS. To convert the above max value (which is always positive because it comes from an absolute value) to dBFS use this formula:

dBFS = 20 * log10(max);

You may find it odd that the loudest a signal can normally be is 0 dBFS, but this is how it is. You may find it useful to think of dBFS as "headroom", ie, answering the question "how many dB can I add to the signal before it reaches the maximum?" (Headroom is actually equal to -dBFS, but I've often seen headroom labeled as dBFS when the context makes it clear.)

Thursday, May 30, 2013

The ABCs of PCM (Uncompressed) digital audio

Digital audio can be stored in a wide range of formats. If you are a developer interested in doing anything with audio, whether it's changing the volume, editing chunks out, looping, mixing, or adding reverb, you absolutely must understand the format you are working with. That doesn't mean you need to understand all the details of the file format, which is just a container for the audio which can be read by a library. It does mean you need to understand the data format you are working with. This blog post is designed to give you an introduction to working with audio data formats.

Compressed and Uncompressed Audio

Generally speaking, audio comes in two flavors: compressed and uncompressed. Compressed audio can further be subdivided into different kinds of compression: lossless, which preserves the original content exactly, and lossy which achieves more compression at the expense of degrading the audio. Of these, lossy is by far the most well known and includes MP3, AAC (used in iTunes), and Ogg Vorbis. Much information can be found online about the various kinds of lossy and lossless formats, so I won't go into more detail about compressed audio here, except to say that there are many kinds of compressed audio, each with many parameters.

Uncompressed PCM audio, on the other hand, is defined by two parameters: the sample rate and the bit-depth. Loosely speaking, the sample rate limits the maximum frequency that can be represented by the format, and the bit-depth determines the maximum dynamic range that can be represented by the format. You can think of bit-depth as determining how much noise there is compared to signal.

CD audio is uncompressed and uses a 44,100 Hz sample rate and 16 bit samples. What this means is that audio on a CD is represented by 44,100 separate measurements, or samples, taken per second. Each sample is stored as a 16-bit number. Audio recorded in studios often uses a bit depth of 24 bits and sometimes a higher sample rate.

WAV and AIFF files support both compressed and uncompressed formats, but are so rarely used with compressed audio that these formats have become synonymous with uncompressed audio. The most common WAV files use the same parameters as CD audio: 44,100 Hz and bit depth of 16-bits, but other sample rates and bit depths are supported.

Converting From Compressed to Uncompressed Formats

As you probably already know, lots of audio in the world is stored in compressed formats like MP3. However, it's difficult to do any kind of meaningful processing on compressed audio. So, in order to change a compressed file, you must uncompress, process, and re-compress it. Every compression step results in degradation, so compressing it twice results in extra degradation. You can use lossless compression to avoid this, but the extra compression and decompression steps are likely to require a lot of CPU time, and the gains from compression will be relatively minor. For this reason, compressed audio is usually used for delivery and uncompressed audio is usually used in intermediate steps.

However, the reality is that sometimes we process compressed audio. Audiophiles and music producers may scoff, but sometimes that's life. For example, if you are working on mobile applications with limited storage space, telephony and VOIP applications with limited bandwidth, or web applications with many free users, you might find yourself needing to store intermediate files in a compressed format. Usually the first step in processing compressed audio, like MP3, is to decompress it. This means converting the compressed format to PCM. Doing this involves a detailed understanding of the specific format. I recommend using a library such as libsndfile, ffmpeg or lame for this step.

Uncompressed Audio

Most stored, uncompressed audio is 16-bit. Other bit depths, like 8 and 24 are also common and many other bit-depths exist. Ideally, intermediate audio would be stored in floating point format, as is supported by both WAV and AIFF formats, but the reality is that almost no one does this.

Because 16-bit is so common, let's use that as an example to understand how the data is formatted. 16-bit audio is usually stored as packed 16-bit signed integers. The integers may be big-endian (most common for AIFF) or little-endian (most common for WAV). If there are multiple channels, the channels are usually interleaved. For example, in stereo audio (which has two channels, left and right), you would have one 16-bit integer representing the left channel, followed by one 16-bit integer representing the right channel. These two samples represent the same time and the two together are sometimes called a sample frame or simply a frame.

Sample Frame 1: | Left MSB | Left LSB | Right MSB | Right LSB |
Sample Frame 2: | Left MSB | Left LSB | Right MSB | Right LSB |
2 sample frames of big-endian, 16-bit interleaved audio. Each box represents one 8-bit byte.

The above example shows 2 sample frames of big-endian, 16-bit interleaved audio. You can tell it's big-endian because the most significant byte (MSB) comes first. It's 16-bit because 2 8-bit bytes make up a single sample. It's interleaved because each left sample is followed by a corresponding right sample in the same frame.

In Java, and most C environments, a 16 bit signed integer is represented with the short datatype. Therefore, to read raw 16 bit data, you will usually want to get the data into an array of shorts. If you are only dealing with C, you can do your IO directly with short arrays, or simply use casting or type punning from a raw char array. In Java, you can use readShort() from DataInputStream.
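If you would rather not rely on casting a raw byte array, which only works when the file's byte order matches your machine's, you can assemble each sample from its two bytes explicitly. The following is a sketch with hypothetical function names. It assumes a two's complement machine, which is what virtually every current platform uses, for the final conversion to a signed value.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one 16-bit signed sample from two bytes, most significant first
   (the usual AIFF layout). */
int16_t decode_s16_be( const unsigned char *p ) {
   return (int16_t)(uint16_t)( ( (unsigned) p[0] << 8 ) | p[1] );
}

/* Same, but least significant byte first (the usual WAV layout). */
int16_t decode_s16_le( const unsigned char *p ) {
   return (int16_t)(uint16_t)( ( (unsigned) p[1] << 8 ) | p[0] );
}

/* Decode count big-endian samples (2 * count bytes) into out. */
void decode_buffer_s16_be( const unsigned char *bytes, size_t count,
                           int16_t *out ) {
   for( size_t i = 0; i < count; ++i )
      out[i] = decode_s16_be( bytes + 2 * i );
}
```

Because the bytes are combined with shifts, this gives the same result regardless of the byte order of the machine running it.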

To store 16-bit stereo interleaved audio in C, you might use a structure like this:

typedef struct {
   short l;
   short r;
} stereo_sample_frame_t ;

or you might simply have an array of shorts:

short samples[];

In the latter case, you would just need to be aware that when you index an even number it's the left channel, and when you index an odd number it's the right channel. Iterating through all your data and finding the max on each channel would look something like this:

int sampleCount = ...//total number of samples = sample frames * channels
int frames = sampleCount / 2 ;
short samples[]; //filled in elsewhere

// ints, not shorts, so that abs( -32768 ) = 32768 does not overflow
int maxl = 0;
int maxr = 0;
for( int i=0; i<frames; ++i ) {
   maxl = MAX( maxl, abs( samples[2*i] ) );    // even index: left channel
   maxr = MAX( maxr, abs( samples[2*i+1] ) );  // odd index: right channel
}
printf( "Max left %d, Max right %d.", maxl, maxr );

Note how we find the absolute value of each sample. Usually when we are interested in the maximum, we are looking for the maximum deviation from zero, and we don't really care if it's positive or negative -- either way is going to sound equally loud.

Processing Raw Data

You may be able to do all the processing you need to do in the native format of the file. For example, once you have an array of shorts representing the data, you could divide each short by two to cut the volume in half:

int sampleCount; //total number of samples = sample frames * channels
short samples[]; //filled in elsewhere

for( int i=0; i<sampleCount; ++i ) {
   samples[i] /= 2 ;
}


A few things to watch out for:

  • You must actually use the native format of the file or the proper conversion. You can't simply deal with the data as a stream of bytes. I've seen many questions on stack overflow where people make the mistake of dealing with 16-bit audio data byte-by-byte, even though each sample of 16-bit audio is composed of 2 bytes. This is like adding a multidigit number without the carry.
  • You must watch out for overflow. For example, when increasing the volume, be aware that some samples may end up out of range. You must ensure that all samples remain in the correct range for their datatype. The simplest way to handle this is with clipping (discussed below), which will result in some distortion, but is better than the "wrap-around" that will happen otherwise. (The example above does not have to watch out for overflow because we are dividing, not multiplying.)
  • Round-off error is virtually inevitable. If you are working in an integer format, eg 16-bit, it is almost impossible to deal with roundoff error. The effects of round-off will be minor but ugly. Eventually these errors will accumulate and be noticeable. The example above will definitely have problems with roundoff error.
As long as studio quality isn't your goal, however, you can mix, adjust volume and do a variety of other basic operations without needing to worry too much.
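As an illustration of the overflow point, here is a sketch of a volume change that can go up as well as down, with clipping instead of wrap-around. The function name is hypothetical, and it assumes that short is 16 bits, as it is on most platforms.

```c
#include <limits.h>
#include <stddef.h>

/* Multiply each 16-bit sample by gain, clipping to the valid short range
   instead of letting out-of-range values wrap around. */
void apply_gain_s16( short *samples, size_t count, float gain ) {
   for( size_t i = 0; i < count; ++i ) {
      float v = samples[i] * gain;
      if( v > SHRT_MAX ) v = SHRT_MAX;
      if( v < SHRT_MIN ) v = SHRT_MIN;
      samples[i] = (short) v; /* truncates toward zero: round-off error */
   }
}
```

With a gain of 2, a sample of 20000 comes out as 32767 rather than wrapping around to a negative number. The cast back to short still throws away the fractional part, so the round-off caveat above applies here too.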

Converting and Using Floating Point Samples

If you need more powerful or flexible processing, you are probably going to want to convert your samples to floating point. Generally speaking, the nominal range used for audio when audio is represented as floating point numbers is [-1,1].

You don't have to abide by this convention. If you like, you can simply convert your raw data to float by casting:

short s = ... // raw data
float f = (float) s;

But if you have some files that are 16-bit and some that are 24-bit or 8-bit, you will end up with unexpected results:

char d1 = ... //data from 8-bit file
float f1 = (float) d1; // now in range [ -128, 127 ]
short d2 = ... //data from 16-bit file
float f2 = (float) d2; // now in range [ -32,768, 32,767 ]

It's hard to know how to use f1 and f2 together since their ranges are so different. For example, if you want to mix the two, you most likely won't be able to hear the 8-bit file. This is why we usually scale audio into the [-1,1] range.

There is much debate about the right constants to use when scaling your integers, but it's hard to go wrong with this:

int i = //data from n-bit file
float f = (float) i ;
f /= M;

where M is 2^(n-1). Now, f is guaranteed to be in the range [-1,1]. After you've done your processing, you'll usually want to convert back. To do so, use the same constant and check for out of range values:

float f  = // processed data
f *= M;
if( f < -M )  f = -M;   // M is 2^(n-1); note that ^ in C is XOR, not a power
if( f > M-1 ) f = M-1;
i = (int) f;
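For the common 16-bit case, where M is 32768, the two conversions could be packaged like this. It's a sketch with names of my own choosing, and it truncates rather than rounds or dithers when converting back.

```c
#include <stdint.h>

#define M16 32768.0f /* M = 2^(n-1) for n = 16 */

/* 16-bit integer sample to float in the nominal [-1, 1] range. */
float s16_to_float( int16_t s ) {
   return (float) s / M16;
}

/* Float back to 16-bit, clipping anything outside the representable range. */
int16_t float_to_s16( float f ) {
   f *= M16;
   if( f < -M16 )       f = -M16;
   if( f > M16 - 1.0f ) f = M16 - 1.0f;
   return (int16_t) f;
}
```

A sample that is converted to float and straight back comes out unchanged, since dividing and multiplying by a power of two is exact in floating point. It's the processing in between that introduces round-off.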

Distortion and Noise

It's hard to avoid distortion and noise when processing audio. In fact, unless what you are doing is trivial or represents a special case, noise and/or distortion are inevitable. The key is to minimize it, but doing so is not easy. Broadly speaking, noise happens every time you are forced to round and distortion happens when you change values nonlinearly. We potentially created distortion in the code where we converted from a float to an integer with a range check, because any values outside the range boundary would have been treated differently than values inside the range boundary. The more of the signal is out of range the more distortion this will introduce. We created noise in the code where we lowered the volume because we introduced round-off error when we divided by two. We also introduce noise when we convert from floating point to integer. In fact, many mathematical operations will introduce noise.

Any time you are working with integers, you need to watch out for overflows. For example, the following code will mix two input signals represented as an array of shorts. We handle overflows in the same way we did above, by clipping:

short input1[] = ...//filled in elsewhere
short input2[] = ...//filled in elsewhere
// we are assuming input1 and input2 have size SIZE or greater
short output[ SIZE ];

for( int i=0; i<SIZE; ++i ) {
   int tmp = (int)input1[i] + (int)input2[i];
   if( tmp > SHRT_MAX ) tmp = SHRT_MAX;
   if( tmp < SHRT_MIN ) tmp = SHRT_MIN; 
   output[i] = tmp ;
}

If it so happens that the signal frequently "clips", then we will hear a lot of distortion. If we want to get rid of distortion altogether, we can eliminate it by dividing by 2. This will reduce the output volume and introduce some round-off noise, but will solve the distortion problem:

for( int i=0; i<SIZE; ++i ) {
   int tmp = (int)input1[i] + (int)input2[i];
   tmp /= 2;
   output[i] = tmp ;
}

Notes:

A few final notes:
  • For some reason, WAV files don't support signed 8-bit format, so when reading and writing WAV files, be aware that 8-bits means unsigned, but in virtually all other cases it's safe to assume integers are signed.
  • Always remember to swap the bytes if the native endian-ness doesn't match the file endian-ness. You'll have to do this again before writing.
  • When reducing the resolution of data (eg, casting from float to int; multiplying an integer by a non-integer, etc), you are introducing noise because you are throwing out data. It might seem as though this will not make much difference, but it turns out that for sampled data in a time-series (like audio) it has a surprising impact. This impact is small enough that for simple audio applications you probably don't need to worry, but for anything studio-quality you will want to understand something called dither, which is the only correct way to solve the problem.
  • You may have come across one of these unfortunate posts, which claims to have found a better way to mix two audio signals. Here's the thing: there is no secret, magical formula that allows you to mix two audio signals and keep them both at the same original volume, but have the mix still be within the same bounds. The correct formula for mixing two signals is the one I described. If volume is a problem, you can either turn up the master volume control on your computer/phone/amplifier/whatever or use some kind of processing like a limiter, which will also degrade your signal, but not as badly as the formula in that post, which produces a terrible kind of distortion (ring modulation).
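The first two notes translate into very little code. Here is a sketch, again with hypothetical names: one function swaps the bytes of a 16-bit sample, and one converts an unsigned 8-bit WAV sample, where silence sits at 128, into the nominal floating point range.

```c
#include <stdint.h>

/* Swap the two bytes of a 16-bit sample (big-endian <-> little-endian). */
uint16_t swap_bytes_16( uint16_t v ) {
   return (uint16_t)( ( v << 8 ) | ( v >> 8 ) );
}

/* Convert an unsigned 8-bit WAV sample (0..255, silence at 128)
   to a float in the nominal [-1, 1] range. */
float wav_u8_to_float( uint8_t u ) {
   return ( (int) u - 128 ) / 128.0f;
}
```

Swapping twice gets you back where you started, which is why the same function serves for both reading and writing.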

Sunday, April 28, 2013

Your technical co-founder as a partner, not a builder

One of the things I complained about in my last post was would-be company founders who are looking for nerds to build their product based on a specific idea or vision they've already created. Obviously, there's nothing wrong with hiring someone to build a product the way you want -- I build products for people all the time based on what they had in mind (and, I should add, I find this a highly rewarding experience). So what's the problem?

The problem happens not when you hire someone to build something for you, but when you view your cofounder and/or CTO as someone who is building something for you. Your technical cofounder is not a contractor or regular employee; rather they are part of the core team responsible for the direction of the company. As such, they need to be invested in the company strategy (both literally, with stock, and emotionally) and their role is to help determine the company strategy and define the company's product.

You may think you've already defined the product, you just need someone to build it. If that's really the case, then get yourself a contractor -- ain't no shame in that! Unless you're just doing something like building a simple e-commerce website as the storefront for your existing business, though, it's probably not that simple. Building new technology involves risks and tradeoffs, and you need someone who understands those risks and tradeoffs and how they will impact the company strategy. That's where the CTO comes in.

Ideally, of course, all your employees should feel empowered to have some say over their domain area, but when founding a company, it's especially important to have a CTO who is invested. The reason is simple: most engineers are happy to build a product to spec, but a CTO who is invested will make sure the spec is both technically viable and makes sense for the business. Your CTO is, therefore, not just another employee: they are your partner in nurturing your product in all stages of development from conception to completion. This may seem counterintuitive if you already have a great, multi-million dollar idea, wireframes and so on, but it so happens that unless a technical person has looked it over and vetted it thoroughly (and probably changed it a lot), your idea is far from the great, multimillion dollar idea you thought it was. Worst case: it's not even viable. (Obviously, proper vetting from a technical person is not enough to make it a great, multimillion dollar idea, either, but that's another conversation). To put this another way, having your CTO invested in your product vision from the beginning removes significant risk from your business model because they can anticipate problems and see better solutions.

When I do contract work, there is always a technical person as part of the team that hired me. Whether that person is actually a CTO or not, they understand the requirements and know how my work is going to fit in with the business model, and they understand what tradeoffs are being made when it comes time to make a major decision. Without a person like that, who can make decisions with your business' best interests in mind, you will not be able to make the right technical decisions for your business.

In short, the job of a contractor is to build a product to spec on time and on budget. The job of the technical co-founder is to make sure that technical decisions are made with the business's best interests in mind. Both are useful people, but don't confuse the two.

Friday, April 26, 2013

Why you haven't found a tech co-founder

As a nerd who has accomplished a thing or two in a field that many people think it would be sexy to "disrupt" (music technology), I have been asked by many people in the last few years to co-found companies with them. These offers range from serious and interesting to what I call "drive-bys": people with some idea or other who are looking for an engineer to build it. So far, I have said no to everyone and am continuing on my own projects, which are far more interesting and rewarding.

If you are looking for a tech co-founder, I am going to tell you why you've had a hard time finding people like me, why we've probably said no to you, and what you can do to change the situation. As is always the case in business when you are stuck, you need to understand the situation from the other person's perspective in order to get what you want. Our perspective may not be right, and I am going to say some cold, harsh stuff, but I think you will feel enlightened if you stand in our shoes for a minute. For starters, I'm assuming you aren't making an amateur mistake.

Why nerds don't go looking for "non-technical" co-founders

It may or may not surprise you to discover that when nerds (or technical founders, or future-CTOs or whatever you want to call us) decide to start a company, we don't go looking for a "business" expert. Startup business just isn't that complex. It is time-consuming, but not complex. What is complex, usually, is the technology: you need to build something reliable, scalable, easy to use and awesome. Believe me when I say that's fucking hard. And you usually need to do that for way less money than a big company would spend. That's the kind of shit alpha-nerds live for. Rightly or wrongly, we don't go looking for someone with an MBA to help us build a startup, especially if the people we meet with MBAs have less of a clue about the market than we do (which is the case surprisingly often). I'm not saying there's nothing you can offer. What I'm saying is that if you do have something to offer, you need to prove it.

You might have nothing to offer

You might think you have a lot to offer, but nerds don't see it that way: we don't need non-nerds to found a company, start building product and get our first users. I know that the classic company founding team is two people: one "builder" and one "seller", presumably the nerd and the business expert. But if you look at some of the best tech companies, the "seller" was often actually a technical person who learned to sell and learned the basics of running a business. The classic tech company founding team is more accurately described as a super-nerd (the builder) and a regular nerd (the seller). But I know, you've still got something we don't: a brilliant idea! There are two problems with that: 1. we also have great ideas, and 2....

Your brilliant idea is worthless

I know it sounds cold when I say your idea is worthless, but look at the companies that have come out of the last few decades of startups. They have either been simple ideas (eBay, Amazon, etc.), copycats (Facebook), or the ideas have been technical (Google, Heroku, etc.), and let's face it, only a nerd could come up with a technical idea. Moreover, unlike you, nerds know what ideas can and can't be built. Don't get me wrong, ideas are not easy, and they are great starting points, but startups don't win on ideas: they win on execution. The value of the idea is that if everyone is invested in it and understands it, the team can work together cohesively. There's no better way to sour your team's enthusiasm for a project than to have "them" build "your" idea, and there's no better way to turn off a potential tech co-founder than to say "I have a great idea for a company, I just need someone to build it" (yes, I've heard this. A lot.). Sometimes companies do get started this way, but it's a sure sign of an unhealthy company that will be lucky to stay functional. If you want to find a tech cofounder, develop a relationship with them first, and then build the idea with them. One other note about ideas is that they often change in the face of your first product release and its response from the market, so your super-brilliant idea is probably going to change anyway. (If that's news to you, read up on the lean startup methodology.)

What can you do about it?

If you aren't friends with any nerds who can build a business with you, you are not out of luck. Some common suggestions I've read are:
  • Learn to code yourself. This might work if you are young and ambitious. It will certainly have the benefit of helping you speak the language, understand technical problems, meet other nerds and so on. I would certainly applaud the effort of learning to code, but let's face facts: you can't become your own technical cofounder unless your technical needs are very simple or you are willing to put serious time into it. There's more to being a CTO than "coding" and you are unlikely to get a deep understanding of technology by taking a crash course in Ruby.
  • Outsource for your prototype. This actually works sometimes, but it will cost you and it's not as easy as it sounds. Outsourcing (whether on- or off-shore) requires that someone on your team has a clue about what's going on technically. You really need to be in touch with the technical team every day. Make sure you understand their technologies and, more importantly, their methodologies. If that sounds excessive, talk to anybody who's tried outsourcing. Keep in mind that even if you go with this strategy, you'll usually need to find a CTO and have them prove themselves before anyone will give you significant funding, which means you are going to need a plan to pay for this before you even approach a VC.
To the extent that either of these solutions works, they work because they put you in a position where you can offer something real back to the nerd: a prototype that validates your business model. (I'm assuming, here, that you've built something with some degree of success). To that end, I think there are better solutions:
  • Build your connections. Trust me: nerds will look at your LinkedIn profile. If we don't see hundreds of contacts (including at least as many VCs, angels, and potential customers as we have) we won't keep talking to you. Why should we?
  • Accomplish something. What better way to prove to nerds that you are worth working with than to have done something successful in the past? If your resume is full of school and jobs rather than accomplishments, then you need to work on your resume. Try joining an early-stage startup, or helping a startup raise money. If you can do that, you will have awesome nerds knocking down your door. If you can't do that, at least try writing a successful blog, or running a meetup.

The bottom line

The bottom line here is that startups don't need average business people. They don't need people who specialize in ideas or "solving business problems." Startups need doers, and they need doers who work together. If you can't prove that you are a doer, then other doers will not be interested. Think of it this way: would an investor give you money? If not, then why should a highly accomplished nerd give you their time?