How Affiliate Marketing Works

From WikipediA:

Affiliate marketing is a type of performance-based marketing in which a business rewards one or more affiliates for each visitor or customer brought by the affiliate’s own marketing efforts. The industry has four core players: the merchant (also known as ‘retailer’ or ‘brand’), the network(that contains offers for the affiliate to choose from and also takes care of the payments), the publisher (also known as ‘the affiliate’), and the customer. The market has grown in complexity, resulting in the emergence of a secondary tier of players, including affiliate management agencies, super-affiliates and specialized third party vendors.


There are a number of participants in the affiliate marketing space.  To easily understand both where you fit and who you should work with, here is a simple overview of the ecosystem:

  • Merchant: The Merchant/Retailer/Brand is essentially the entity that is looking to acquire customers through affiliate marketing, and is paying, based on a fixed $ per lead/phone call, % of sale, per visits, etc, for a referral for a potential or actual customer.
  • Agency: As in traditional marketing, there are Affiliate Marketing agencies such as Gen 3 Marketing, Affiliate Traction, or OPM Pros.  They are traditional services organizations that perform a wide range of tasks, including: affiliate recruitment & relationship management, concept & design of advertising creatives, or promotion distribution.
  • Vendor:  A catch-all term for 3rd-party product or services to the affiliate industry, such as data feed transformations, analytics, or Email marketing.  A few examples include: GoldenCAN, Mercent, Google Analytics, or Omniture.
  • Network: The Middleman.  Affiliate networks provide both the technical infrastructure and the business logistics for affiliate marketing to operate.  Responsibilities include: 1) running and maintaining the tracking software that credits referrals from affiliates for actions, 2) recruitment for both merchants and affiliates, and 3) handling all of the financial details (tax forms & payments for affiliates and merchants).  Some examples include: Commission Junction, LinkShare, PepperJam, Affiliate Window, ShareASale, or ClixGalore.
  • Affiliate: The Publisher or Affiliate is the website that 1) finds customers, 2) qualifies customers, and 3) refers the customer to the Merchant.  Common forms of affiliates include CSE (Comparison Shopping Engines), Blogs or other Content Publishers, Mailing lists, or SEM/Display Advertisers.
  • Customer: This is the person that 1) The Affiliate finds, and then refers to the Merchant, 2) Buys products/services from the Merchant, and 3) Pays or subsidizes the whole affiliate marketing system.

Great Infographic of the Ecosystem


Now that we’ve covered the ecosystem participants, here’s a simple explanation of the economics behind each member.

  • Merchant: The merchant either buys or produces for $X dollars a product or service, and then sells it to the Customer for $Y dollars.  (Whether on an individual sale, or lifetime expected Gross Receipts/Gross Cost of Goods Sold basis for a customer).  The difference between the revenue ($Y) and the costs ($X) is the Gross Profit.  Subtract out the Cost of Customer Acquisition (the associated affiliate commissions), and you’re left with the Net Profit, which is what the Merchant ultimately makes, even if that’s a negative number.
    A quick note about Marketplace vs eCommerce Economics:

    • Marketplaces: Amazon, eBay,, and Sears all operate “marketplaces”, where multiple Merchants can upload a list of products available to sell & ship.  At these Marketplaces, the Customer interacts directly with the Marketplace (Amazon/eBay/etc) to find their product, make their purchase, and process payment.  Nearly all customer contact details/payment information/purchase history/etc remains with the Marketplace,  and all future communication with Customers is limited to the transaction details.  In this scenario, the Merchant, for all intents and purposes, is more similar to drop shipper than an store or seller, profitability is simply calculated by revenue received from the marketplace minus your costs of fulfillment.
    • eCommerce Stores: This is your traditional branded eCommerce store & shopping cart software.  Customers navigate directly to the Merchant website, find products, create an account, make a purchase, and the Merchant processes the order.  One distinct difference between the Marketplace model and this one, is the customer BELONGS to the Merchant.  The Merchant is responsible for the entire shopping experience, and usually has the opportunity to market (eMail, phone, snail-mail, etc) to the Customer in the future.  If acquiring Customers through Affiliate Marketing, the initial cost is often referred to as a Cost of Customer Acquisition, since unlike the Marketplace model, Merchants have the opportunity for future marketing interactions and generating repeat orders without the initial Cost of Customer Acquisition.
  • Agency: Generally, most traditional Marketing Agencies charge clients a % of marketing spend.  In keeping with tradition, so do most Affiliate Marketing Agencies.  For example, if an affiliate marketing program generates $10k in affiliate commissions, a 5% agency fee to the Merchant is $500.  Agencies also have monthly minimums that may be higher than the service fee %.
  • Vendor:  Most often vendors will charge the Merchant either a flat rate for a service ($5k/month) or a variable based upon the actual usage (e.g. $5/per feed).  They also may have a monthly minimum.
  • Network: Here’s where it gets tricky.
    • Network Access Fee/Setup Costs: These fees range for $0 – $10k as a one time payment.  One network described their setup fee as follows: “The set up fee buys you into a network that not only offers an unsurpassed level of service and experience, but we are also leading the affiliate market in terms of the technology we offer. Our technology aims to provide a more targeted variety of creative linking methods, the tracking of which encourages publishers to work with us and increases your conversion rate.”
    • Minimum Deposit Amount: Generally, the Merchant’s account with the network acts similar to a Debit Card/Checking account.  There is usually an account minimum deposit such that a Merchant pre-pays commissions to the Network, from to which the Network deducts affiliate commissions.  This minimum deposit generally ranges from $0 to $2k.
    • Transaction fees/Override Amount: This is the true revenue stream for most Networks.  Often called a Transaction Fee or Override Amount, these fees are typically a  percent of Affiliate Commissions charged to the Merchant for use of the Network services.  This fee is generally between 20%-30% of Affiliate Commission.  For example, assume that the Affiliate Commission is 10% and the Override Amount is 20%.  With a Sales Amount is $100, the Commission payout is $10 to the Affiliate and the network would charge 20% of the $10, or $2 for the Override Amount, with total charge of $12 to the Merchant.
    • License Fee/Monthly Charge: Some networks charge an additional fee for the ongoing tools and customer support burden.  One network described their License Fee as: “This monthly cost enables access to all the contact tools, tracking, account support services, real-time sales reports, invoicing & payment services and technical support.”  Generally this is between $0 and $500/month.
    • Monthly Minimum: There is, commonly, a monthly minimum for the Transaction fees/Override Amount.  For Example, If the Transaction fees are 20% and there was $10k of gross sales generated, with a 10% affiliate commission, the amount due to the Affiliate Network would be $200 ($10k gross sales * 10% affiliate commission * 20% network transaction fees = $200).  If the Monthly minimum is $250, then the greater of the two (calculated $200 vs minimum $250 is charged).  Generally this amount ranges from $0 – $500 depending on the network.
  • Affiliate: The affiliate or publisher earns the Cost Per Action or Commission as defined by the Merchant affiliate agreement.  This is often a percent of the commission sale, % of reoccurring subscription revenue, or a flat amount per signup, form completion, or order.
  • Customer: Again, the customer is ultimately looking for the Merchant’s product or service, and the Affiliate finds the customer and refers them to the Merchant, where they make a purchase and the Affiliate gets compensated for the referral.


After understanding the aspects of how all of the business relationships work, the next step is to understand the mechanics of the implementations.  There are two basic implementations, one involves Cookies and the other involves back-end API development.  For the purpose of simplicity, I’ll just cover how the Cookie Based approach works, which is also the most common.

  • The Affiliate Network will generate a unique URL for each Affiliate for each and every Merchant.  In an example Commission Junction link: the 3576795 represents the Affiliate, and the 10746577 represents both the Merchant and a specific advertising campaign the is being run.
  • A Customer will click/navigate to that URL and have a Cookie set saying the Affiliate 3576795 referred that customer to the Merchant.  The AdId (10746577) is also associated with a destination page (in this case to which the customer is redirected to.
  • The customer performs product searches, adds an item to the Shopping Cart, and continues to check out.
  • On the “Thank You” page, the Merchant has a pre-installed 1×1 transparent gif (to the network) that has an URL parameter such as productSales=100.00 that the Affiliate Network then uses to decide between whether 1) the purchase was made by a referral from a pre-existing Cookie for Affiliate 3576795 and then credits the account for the commission, or 2) doesn’t see a cookie and ignores the purchase.
Affiliate Marketing Workflow

Affiliate Marketing Workflow

Real Life

When merchants ask me about affiliate networks, they often want to know the details.  I can’t know perfectly but here are the fees as I understand them:

Network’s Name
Setup Fee
Transaction Fee
Affiliate Future $1,000 $0 30% of commission $100/mo access fee
3% of sale’s total
Affiliate Window $5,000  Unknown 30% of commission $500 monthly min
Commission Junction $3,000 30% of commission $500 monthly min
Google Affiliate Network (closing down)
varies ($1,000-$5,000)
25% of commission $500 monthly min
LinkShare $1,000 $3,000 2-3% of sale’s total $500 monthly min
Pepperjam no data
ShareResults $500 $500 25% of commission
20% of commission
$25 monthly minimum
25% of commission
$0 monthly minimum

I created an example Google Document with the simple calculations of Commission Rate Scenarios + Affiliate Network Rates:

Author Note: I consider this post a draft, and will need to add a lot more for it to be anywhere near comprehensive.  5/6 Edit: Some verbiage updates & inclusion of Marketplace vs eCommerce Store economics.


Scala @ Frugal Mechanic

Last night Tim gave a talk about Scala at the Seattle Tech Startups meeting.

From the Scala website:

Scala is a modern multi-paradigm programming language designed to express common programming patterns in a concise, elegant, and type-safe way.  It smoothly integrates features of object-oriented and functional languages.

At Frugal Mechanic we use Scala for all of our backend processing and for about 50% of our front end (the other 50% is Ruby on Rails).  If you’re interesting in learning more about Scala here are some resources to get you going:

The slides from his talk are available here or from SlideShare:

A Brief Intro to Scala

View more presentations from tpunder.

Building a Scrabble MMO in 48 hours


Last weekend, I was lucky enough to be able to participate in the first Node Knockout. The rules are simple: you have 48 hours to code someone amazing using the Node.js platform. Planning ahead is allowed, however it has to be all on paper (no digital assets, designs, etc)

I’m not sure what it is about these weekend coding competitions, but they’ve always fascinated me. I love seeing real life demonstrations of the fact that technology has progressed to the point where you can go from a harebrained idea to deployed and functional product in a few intense dev days. Or maybe it’s because I remember my previous life as an ASP.NET developer where the first week of a project would barely cover the schema design and data access layer. Either way, it’s great for the Node.js community. If you want to convince the world that a language/framework is worth using, the best way is use it to build amazing stuff, amazingly fast.

Having also competed as a solo team in Rails Rumble the last few rounds with decent results, I had a good idea of what to expect. This time I was able to team up with some really talented local developers I met at MongoSeattle a few months ago: Grant Goodale, Damon Cortesi, and Aviel Ginzburg. I figured with a team of 4, we could really bite off something ambitious and might have a chance at actually completing it.

The Final Product

Scrabbly board

What we were able to build was a Massively Multiplayer Online Scrabble game. Tiles are placed in realtime, and each player competes with everyone else to build the longest chain of words and the highest score. Once you place a tile, it’s on the board forever and other players have to work around it. You can start your game on anyone’s tiles, but after you place your first word you will need to stick to your own tiles until you run out of lives (swapping tiles loses a life).

The Tools


As soon as we came up with the idea, we knew exactly the right tool for this application. MongoDB geospacial indexing was a lifesaver here. We are able to store every tile thats placed on the board by its x,y coordinate and query a bounding box whenever someone changes their view. The resulting queries are ridiculously fast, even with a tile collection approaching over 90 thousand. Once we get too big for that, scaling is going to be dead simple given we can just use modular arithmetic to produce a shard key.


For the web server, we ended up using ExpressJS since it seemed like the most mature out of the NodeJS web frameworks. As a Ruby/Rails developer (no real node/expressjs experience before), I mostly avoided this aspect of the application, and just worked with static html files, javascript, and css. From what I did use of ExpressJS, it felt very young. Works well if you do everything exactly right and are willing to do some trial and error, but don’t expect edge cases to work, or any kind of comprehensive documentation. Just the simple act of setting cookies, or performing a redirect took dev effort. Node’s philosophy seems to favor that you avoid abstracting too much. This was frustrating to begin with but I soon realized that I really should be knowing the internals of these mechanisms anyways. Using node will force you to become an expert in all things HTTP, which is not a bad thing at all.


For the UI end of things, I figured I’d just stick to what I know. JQuery and JQuery UI handled the job just fine. I do have some issues with the codebase of JQuery UI, and it’s not very hackable, but it’s widely used and well documented. I felt I had to compromise a lot on the usability however and would have loved to rewrite the sortable component (for the Tile Rack) myself in order to make it work exactly as I wanted. But that just wasn’t possible in the timeframe we had.


Hosting wise we used Heroku (with pre-alpha node support). MongoHQ handled the mongodb hosting for us. It all worked like a charm. We had no issues whatsoever while in development, and everything deployed smooth as butter. After we started getting hammered with traffic, the Heroku node does tend to go down a few times a day, requiring a restart but I’m sure they’ll get that fixed up once they hit their actual release. In the meantime, after the competition we’ll probably move our servers elsewhere.

The idea meetup (2 weeks before NodeKO)

Hosting wise we used Heroku (with pre-alpha node support). MongoHQ handled the mongodb hosting for us. It all worked like a charm. We had no issues whatsoever while in development, and everything deployed smooth as butter. After we started getting hammered with traffic, the Heroku node does tend to go down a few times a day, requiring a restart but I’m sure they’ll get that fixed up once they hit their actual release. In the meantime, after the competition we’ll probably move our servers elsewhere.

The idea meetup (2 weeks before NodeKO)

The first meeting we had was the idea session. I came prepared with my own idea (along with 3 pages of notes, use cases, and monetization strategies) for a variant on real time web chat. The team heard me out, but it promptly got rejected. It was too bland, and they figured everyone else would be doing it. And if we were going to do this for free in our spare time, they wanted to build something fun. I agreed as well, so we started thinking up game ideas.

As you can probably imagine, these were all over the place initially. Started off exploring the idea of building the card game “Set” (a real time card game). We decided we should build something everyone understands, and thought about doing an online Battleship. Then Scrabble. Then we started thinking about ways to combine them “You sank my Scrabbleship!”. After some additional debate, the idea of Scrabble on an infinite board started to seem plausible. An hour and a couple filled white boards later we had figured out what we wanted to do, along with some initial ideas for game mechanics.

We also came up with the name. We knew it wasn’t going to last, but seemed fun to use for the short term. Until the lawyers at Hasbro got wind of it of course.

The feasibility meetup (1 week before NodeKO)

After getting psyched up about the infinite Scrabble idea, we wanted to test out whether the game mechanics would actually work. What better way than to chop up and glue together 5 Scrabble boards. The result is pictured here:

Scrabble Boards

Sitting down and playing with 4 people together on one gigantic Scrabble board allowed us to tweak the rules a bit, and test if it would actually be fun to play in practice. We had a blast, which was proof enough to me that this thing was worth building.

Unfortunately we also found out that one of our team members (Damon) had an important client meeting the next weekend so we’d be down to 3 people on the project. This was a bummer, but we pressed on.

The UI meetup (1 day before NodeKO)

Right before the competition, we met up one more time in order to really nail down the final user interface design before the competiton started. This time we brought some graph paper and colored pencils then started drawing.





We were also able to snag meetings with a couple of the Seattle Techstar teams to walk them through the paper prototypes and get feedback on what made sense and what didn’t. This was incredibly useful to finalize all our intro copy, button labels, and the general user interaction of the site before the competition started. We now had reasonable confidence that most users would understand and be able to use our app once it was built.

Finally, before we left for the night, we came up with a gameplan for each of us on what tasks we need to do in what order.



We decided that if we were going to win this thing, we’d need to have one person (Aviel) fully focused on design for the entire duration of the competition. That turned out to be a great decision as it allowed us to attain a level of design polish and multiple iterations of design ideas that wouldn’t have been possible without a dedicated designer fully obsessing over every pixel.

The 48 hours of coding

Like all the other teams, we started as soon as the competition went underway (Friday at 5pm PST) and worked well into the night. Our team of 3 was able to work together at the Founder Coop offices in Seattle (where TechStars Seattle is located). We took over a conference room and made it our home for the weekend.

I stuck to my gameplan, went line by line, and got each UI feature functional. Grant worked solely on the backend data (mongodb and nodejs), and Aviel was able to focus on designing a gorgeous game board and tiles in Photoshop. Since our interface was just one screen, we were able to spend a lot of time on each little piece of it, and go through a few different design ideas.


After going all of Friday night into Saturday morning, we felt we made great progress. The infinite board was working, along with the dragging tiles back and forth between the rack and the board. And we had some nice initial tile and board design psds to integrate. After about 5 hours of sleep on Saturday, we started back up.


Adding Board Tiles and Basic Rack Panel

Adding Rack Tiles and Drag/Drop

Adding Rack Tiles and Drag/Drop

The Saturday night – Sunday morning was the toughest stretch of the competition by far. Low sleep, and lots of broken functionality definitely took a strain on motivation levels. After working through the night, on Sunday morning I was pretty much resigned to shipping a half broken piece of crap. The basic infinite board, tile placing, and design looked good, but the game didn’t work at all. We couldn’t validate placing a word onto the board, and we hadn’t even started on the push notifications. I toyed with the idea of cutting my losses and just going to sleep.


Adding Compass and Random Tile Generator


Pressing on, we found small wins here and there, and finally got the word placing to work. Integration with push notifications from the server was handled by a 3rd party web service called PusherApp, and it took all of about 15 minutes to get up and functional. After those 2 big features were in, I had the motivation to actually finish the thing. On Sunday from noon to 3pm, we basically built the entire game system within (end turn, score submission, restart game, etc). It was rapid fire. We went though all the missing features, spent 20 – 30 minutes coding each one, and on to the next. The rapidly approaching deadline meant whatever didn’t get done in a short window of development had to be cut.

Adding Play Buttons and Logo Panel

Adding Play Buttons and Logo Panel

Adding the Score Panel, the leaderboard, and the jGrowl Alerts. Lightened up the Board colors.

Adding the Score Panel, the leaderboard, and the jGrowl Alerts. Lightened up the Board colors.

At 3pm on Sunday we were able to focus the next 2 hours purely on bug fixes and polish. Adding an intro screen, putting in a leaderboard, and fixing a ton of bugs with resolving word positions. The frantic code, checkin, deploy, and swearing lasted right up until minutes before the deadline. Unfortunately some bugs caused us to have to strip out the leaderboard as one of our last checkins. But at that point I couldn’t really get anything working right as I was running on fumes and could barely think straight.

After the 5pm deadline, we took some shots of whisky, drank a bottle of wine, and called it a night.

To get a feel for how frantic the dev activity was, here’s the full git commit log: Near the end of the competition I wasn’t even coherent enough to spell the swear words correctly in the commit messages.

Final released version!

Final released version!

Traffic Stats

In the 3 days since we’ve launched, we’ve managed to get almost 7 thousands unique visitors. Over 90 thousand individual Scrabble tiles have been placed on the board so far. Feedback has been fairly positive aside from the fact that the board has gotten way too large and it’s difficult to scroll far enough to find a place to start playing. We plan to iterate with some of these fixes and release a new version on Friday after competition judging ends.

Why we do it

Over 200 teams signed up for Node Knockout, and around 100 produced final judge-able products. The question is, why do we do it?

I think what competitions like this give is a structured way to “prove your chops” to the world. Having a successful entry, even if it doesn’t win, gets huge exposure from important people in the tech sector. Having a concrete example of your abilities in a public forum can mean a huge difference between a low hourly rate and a high one, the difference between a dream job with high salary and a a mediocre one. The tangible benefits of competing successfully can easily equate to tens of thousands of dollars over the course of year for a worthy competitor.

Another great benefit is the motivation that comes with working against the best and the brightest in the industry. You know right away you can’t be slacking off and have a chance. Judging from the results, this seems to help bring out the best in people. The high caliber of the final submitted apps support that.

A Month Later

Just over a month after the competition, has hit 2 million placed tiles and over a hundred thousand unique visitors. Traffic is growing every day through word of mouth, and we’ve had to switch web hosts twice just to handle the load (we’re now on Blue Box Group, for both our web and MongoDB servers).

We never thought our Node Knockout side project would get the response it has so far, but we’ve decided to keep it running and see just how ridiculously huge this Scrabble board can get!


About Author: Jacques Crocker is a Ruby on Rails developer in Seattle.
Follow him on Twitter at @railsjedi

Whitelabel Analytics using Google Analytics

Whitelabel products can be seen anytime you walk into a grocery store.

Crispy_hexagons Whitelabel Product

Image Courtesy of Wikipedia

They’re becoming even more popular in online business models:

Vast Whitelabel


Most whitelabels operate by either providing a back-end API, or in having a DNS CNAME entry pointing to the host.  In the case, for when you’re powering a whitelabel through a CNAME entry, usually the partner wants to have a way to track their stats. An easy way for them to do this is to include their Google Analytics code and then have them setup a Custom Segment to isolate the traffic to the whitelabel site. That setup is easily accomplished through the following instructions:

Creating Custom Reports

If you are only interested in statistics for an exact Web site, following the steps above to drill-down to your site becomes a time consuming process. You can follow the steps below to create a custom report that is already focused on one or more URLs. In Google Analytics, you can create an “advanced segment” to narrow down a report to basically any criteria.

  1. Log into Google Analytics
  2. Click “Advanced Segments” under “My Customizations” on the left menu.
  3. Click “Create New Custom Segment” on the top right of the screen.
  4. In the “type to filter” search box on the left menu type in “hostname” as one word.
  5. Drag the green hostname box over to the center column and on top of the dotted box labeled “dimension or metric”.
  6. Change the condition to “contains” in the drop down box.
  7. In the value box, type in your full URL of your web site minus any slashes, http://, www or anything else. Example:
  8. Down at the bottom of the form you need to give your new segment a name, for example call it “Entomology site traffic”.
  9. Click the button “create segment” to the right of the name box.

You have now created a new segment, which will be available at any point in the future from only your account. This segment can be applied to your reports by following the instructions below.

Viewing Custom Reports (Segments)

  1. Log into Google Analytics
  2. From any report or the Dashboard click down arrow in the drop down window next to “advanced segments” on the top right of the screen.
  3. By default “All Visits” will be checked, which is the default segment that shows all traffic and all sites. Under “Custom Segments” you will see the segments you created.
  4. If you wish to only view your segment, check it in the “Custom Segments” box and un-check “All Visits”. If you leave “All Visits” checked, it will overlay the reports to allow comparisons.
  5. Once you have selected your segment and unchecked “All Visits” you can click “Apply” below the “Custom Segments” menu.

You are now viewing all of the reports that are narrowed to to only this specific segment or Web site. You can verify the segment is selected by looking above the graph on any report and it should say the name of the segment you created.

Howto create “standardized” CSV exports from MySQL

When you start querying/analyzing/and wanting to export data from MySQL (especially anything free-form like SEO/SEM Keyword Data Extracts) you’ll quickly find out you can’t simply concatenate your columns together and join them with a delimiter to be read later.

Sampling some of our keywords illustrates a number of problems:

Quotes: ! 1/4″ inlet muffler

Newlines: xtra vision

headlight h4703xv hours

Control Characters: ^HToyota+starlet+coolant+temp+sensor, 242°/250° duration @ .050″ 601 lift

Commas/Tabs/Semi Colans or other typical delimiters: “+bosch, +part no., +67647″, 2500hd

chevy cat back exhaust

Unicode: ايسوزو 20011

ב.מ.וו 545i, 宝马m3冷却系统

The Original MySQL attempt to export this was:

	SUM(`entrances`) as entrances
INTO OUTFILE 'keyword_data.csv'
	medium = 'organic'

This produced a really great file format, any crazy characters were escaped properly and it looked great on paper:

"organic","! 1/4\" inlet muffler","/cache.aspx?q=\"1+4\"+\"inlet+muffler\"&d=76309488474199&mkt=en-US&setlang=en-US&w=6d792683,38b10c05",1
"organic","!((^ toyota avalon mechanical lifter repairs","/auto-parts/toyota/engine-mechanical-valve-lifter",1
"organic","!(gl-508 rm) motorcraft","/auto-part/149373-motorcraft-gl508rm-remanufactured-alternator",1
"organic","!947 dodge power wagon","/auto-part/983439-mallory-ignition-wire-set-m11947",1
"organic","!967 ford mustang parts","/auto-parts/ford-mustang",1
"organic","!971 142 volvo specifications","/auto-parts/volvo-142/engine-electrical-distributor",1
"organic","!972 buick skylark heater core","/cars/1972-buick-skylark",1

I loaded up FasterCSV to start parsing the file and quickly figured out how naive I was about the CSV file formats. MySQL’s default use of \ as an escape character isn’t “standard.” It turns out there isn’t an official CSV standard but I did find something close:

Hmm so effectively none of the control characters should be escaped, but fields should be quoted if they have a delimeter in them, and a quote in the field should be replaced with “” to be a CSV Standard.

The other wrinkle is the standard calls for \r\n to be the line delimeter, but what happens if there are other \r\n in the columns?

First Step – Remove Escape Characters:


By adding the ESCAPED BY ”, none of the characters will be escaped in the output

Step Two – Replace \r\n with \n:

	REPLACE(`keyword`, "\r\n", "\n")

Step Three – Replace ” with “”

	REPLACE(REPLACE(`keyword`, "\r\n", "\n"), '"', '""')

Step Four – Replace NULL-Byte with a blank

	REPLACE(REPLACE(REPLACE(`keyword`, "\r\n", "\n"), '"', '""'), ""), char(0), "")

Step Five – Replace NULL field with a blank

	IFNULL(REPLACE(REPLACE(REPLACE(`keyword`, "\r\n", "\n"), '"', '""'), ""), char(0), ""), "")

Putting it all together:

 IFNULL(REPLACE(REPLACE(REPLACE(`medium`, "\r\n", "\n"), '"', '""'), char(0), ""), "") as medium,
 IFNULL(REPLACE(REPLACE(REPLACE(`keyword`, "\r\n", "\n"), '"', '""'), char(0), ""), "") as keyword,
 IFNULL(REPLACE(REPLACE(REPLACE(`landingpage`, "\r\n", "\n"), '"', '""'), char(0), ""), "") as landingpage,
 SUM(IFNULL(`entrances`, 0)) as entrances
INTO OUTFILE 'keyword_data_new.csv'
FROM `daily_medium_keyword_landingpage_visits`
WHERE medium = 'organic'
 IFNULL(REPLACE(REPLACE(REPLACE(`medium`, "\r\n", "\n"), '"', '""'), char(0), ""), ""),
 IFNULL(REPLACE(REPLACE(REPLACE(`keyword`, "\r\n", "\n"), '"', '""'), char(0), ""), ""),
 IFNULL(REPLACE(REPLACE(REPLACE(`landingpage`, "\r\n", "\n"), '"', '""'), char(0), ""), "");

Put Google AdSense Into the Corner, Let your page render first using jQuery

First off, if you haven’t installed both Yahoo YSlow! and Google Page Speed then bookmark this page and come back to it later. Chances are you have bigger fish to fry on the rest of your website.

Now if you read my post on AB Testing with Google Website Optimizer, you probably caught on that <script type=”text/javascript”> tags block rendering of content on your pages.

Now that you’re using these tools, your noticing some of these AdSense calls are taking 300ms to render – sometimes multiple SECONDS…wait but when I combine those two facts that means my content below the ads isn’t rendering yet…

WTF Google!

The good news, is, it’s possible to unblock the rendering of your page’s content. The bad news, is, it involves repositioning the ads after the page has been loaded (a little bit of a flicker).


The mechanics of the solution involve three pieces:

  1. A placeholder <div> where the ad should be displayed
  2. Shoving the actual ad elements as one of the last items in your HTML document (so everything else gets rendered first)
  3. jQuery code to reposition the ad back into its placeholder slot

Inline Content:

The container for your ads should be pretty straightforward, you know what size ad unit you’re including so block out the section of your page:

      <div id="fmad_placeholder_Detail_Page_Bottom" style="height: 90px; line-height: 90px;">&nbsp;</div>

Bottom Page Content:

Your moving div that actually calls AdSense/Ad Manager code:

<div id="fmads_Detail_Page_Bottom" style="display: none; text-align:center;">
  <script type="text/javascript">

jQuery Magic to reposition your ads back to where they should be:

<script type="text/javascript">
  var fmPositionGoogleAdsInit = false;
  function fmPositionGoogleAd(ad_name) {
    var fun = (function () {
      var placeholder = $('#fmad_placeholder_'+ad_name);
      var ad = $('#fmads_'+ad_name);

      if(placeholder && ad) {
        var pos = placeholder.offset();
          "display": "block",
          "position": "absolute",
          "left": pos.left+"px",
          "width": placeholder.width()

    if(!fmPositionGoogleAdsInit) {

    fmPositionGoogleAdsInit = true;

  function fmPositionGoogleAds() {
    // List all of your ad slots



Now, if you have any other dynamic content on the pages, you may have to manually reposition the ads again.  This is simply done by calling:


Hope that helps!

Ruby Script to Extract Google Analytics Keywords, Landing Page, Medium Entrances to CSV/TSV format

I’ve had versions of this script hacked together before.  In fact, if you search for “Google Analytics Data Extractor” the #1 result is a google groups posting I made in a former life for a C# program I developed:

There’s now a couple of ruby gems out there for handling the Google Analytics API

One of them I’ve been doing some light-contributing to, including the pagination support (automatically grab all results in 10,000 result chunks)

I pushed a couple of changes up today at: that hopefully will get merged into the main trunk soon (

One of the more useful uses of the library is to extract your Organic Search Results Keywords from Google Analytics. We’re using this at Frugal Mechanic to see how people search for Auto Parts.

require 'rubygems'
require 'gattica'
require 'fastercsv'

ga_profile = "" #Enter your Profile Here
start_date =, 1, 1)
end_date =, 12, 31)
file_path = "data" # Directory Needs to exist

puts "Google Username: "
u = gets.chomp
raise "bad username" unless !u.empty?
puts "Google Password: "
system "stty -echo"
pw = gets.chomp
system "stty echo"
raise "bad password" unless !pw.empty?

class ExtractKeywords
  def initialize(email,password)
    @gs ={:email => email, :password => password})

  def get_accounts
    results = []
      profile = {}
      profile[:site_title] = account.title
      profile[:profile_id] = account.profile_id ## this is the id required for requests to the API
      profile[:account_name] = account.account_name
      profile[:account_id] = account.account_id
      profile[:web_property_id] = account.web_property_id
      results << profile     }     return results   end   def connect_profile(profile)     @gs.profile_id = profile[:profile_id]   end   def connect_profile_id(profile_id)     @gs.profile_id = profile_id   end   def get_keywords(start_date = nil, end_date =     results = []     csv_data = @gs.get({         :start_date => (start_date || (end_date - 365)).to_s,
        :end_date => end_date.to_s,
        :dimensions => ["medium", "keyword", "landingPagePath"],
        :metrics => "entrances",
        :sort => "-entrances",
        :page => true}).to_csv(:long)

    return FasterCSV.parse(csv_data, :headers => true)


gs =, pw)

(start_date .. end_date).each { |date|
  file = "#{file_path}/medium_keyword_landingpage_visits_#{date.strftime('%Y-%m-%d')}.csv"
  next if File::exists?( file ), "w") do |csv|
    csv << ["medium", "keyword", "landingPagePath", "entrances"]
    keywords = gs.get_keywords(date, date)
      csv << [row["medium"], row["keyword"], row["landingPagePath"], row["entrances"]]


Cassandra @ Frugal Mechanic

Last night I gave at talk at Seattle Tech Startups Meetup on Cassandra and how we use it at Frugal Mechanic (although in a very High-Level and more of a case study on how we came to choose Cassandra for our specific needs. Thanks for the great turn-out and some great questions.

One of the most useful slides is my resource slide. Here’s a summary of all of the great talks/Cassandra slide decks I’ve come across:

NoSQL Cassandra Talk for Seattle Tech Startups 3-10-10

View more presentations from egpeters

Rails Plugin for Google Website Optimizer for AB Tests – Dynamic Content Experiments and No Flickering!

TL;DR: If you don’t care how this plugin works and just want to see/get the code, it’s available at github:

There’s a ton of reasons to reasons to invest in conversion rate increases.  If you’re spending $1 to acquire 10 visits, and you can get 2 instead of 1 to convert its like putting an extra dollar in your pocket.  A couple days ago Rand Fishkin from SEOMoz even said “2010 is the Year of Conversion Rate Optimization

Google Website Optimizer’s Homepage will give you a number of reasons to use their product My biggest reasons is its FREE!

Last March, we met with Eric Vasilik with the Google Website Optimizer Team in the Seattle Google Office about some hang-ups we had trying to A/B Test functionality and integrate Google Website Optimizer:

  1. Speed. The out-of-the-box Google Website Optimizer code does something like this:
    1. Download/Initialize GWO Code
    2. Identify which experiment/treatment group the user should see
    3. Download each of the content sections that should change and
    4. Use javascript to dynamically update each of the content sections of the page (FLICKER!)
  2. Static Content. Since you must store the HTML fragments in GWO for the service to return back the appropriate multivariate combination to show to users, the HTML fragments must be static.
  3. Maintenance. This ties back to our #2 reason, we would have to upload each of the HTML fragments into the GWO tool so they could be tested.  Since we have to code them locally to make sure they function & display correctly anyhow, it effectively means we need to duplicate the code to GWO.

The solution?  Leveraging a few very very clever Javascript/HTML tricks this is what we’re doing for our Auto Parts Price Comparison site.


A vanilla GWO Implementation operates similar to:


Tricked out solution looks like:


Why is it clever?  Lets walk through each of the scenarios for the GWO Variations.

Scenario 1: No Javascript Support

<!– Section1 – Default content –>
var GWO_Section1 = utmx(“variation_number”, “Section1″);
if (GWO_Section1 != undefined && GWO_Section1 != 0) document.write(‘<no’ + ‘script>’);
Default content – shown by default<br>

Browser ignores everything inside <script> tags and ignores the </noscript>

<!– Section1 – Variation 1 –>
if (GWO_Section1 == 1) document.write(‘</noscript a=”‘);
Alternative content 1<br>

<!– Section1 – Variation 2 –>
if (GWO_Section1 == 2) document.write(‘</noscript a=”‘);
Alternative content 2<br>

Scenario 2: Browser Supports Javascript – Variation == Control/Default Treatment

<!– Section1 – Default content –>
var GWO_Section1 = utmx(“variation_number”, “Section1″);
if (GWO_Section1 != undefined && GWO_Section1 != 0) document.write(‘<no’ + ‘script>’);
Default content – shown by default<br>

Browser evaluates GWO_Section == 0 so the document.write(<noscript>) is NOT executed resulting HTML code is:

Default content – shown by default<br>

Browser ignores the </noscript>

<!– Section1 – Variation 1 –>
if (GWO_Section1 == 1) document.write(‘</noscript a=”‘);
Alternative content 1<br>

Since GWO_Section == 0, the document.write(‘</noscript is not executed, but however the 2nd <script>document.write IS still executed resulting in

Alternative content 1<br>

The HTML comment starts with the first <!– and keeps going until –> hits (the browser doesn’t even try to interpret the DOM inside the HTML comment – making this the most efficient way to “throw away” code from even being parsed by the HTML engine.  The variation 1 never gets displayed.

Scenario 3: Browser Supports Javascript – Variation == Experiment Treatment

<!– Section1 – Default content –>
var GWO_Section1 = utmx(“variation_number”, “Section1″);
if (GWO_Section1 != undefined && GWO_Section1 != 0) document.write(‘<no’ + ‘script>’);
Default content – shown by default<br>

Browser evaluates GWO_Section1 == 1 so it DOES print the <noscript> resulting in:

Default content – show by default<br>

Browser starts at the first <noscript> and ignores everything until the ending </noscript> – thereby ignoring the control treatment’s HTML

<!– Section1 – Variation 1 –>
if (GWO_Section1 == 1) document.write(‘</noscript a=”‘);
Alternative content 1<br>

GWO_Section1 evaluates to == 1 and the document.write prints out resulting in:

</noscript a=”
Alternative content 1<br>

There’s two clever things in this one. The first is the a=” starts to create an HTML attribute that “ingests” the dangling <!–. The second, is the <!–</script>–> is a complete HTML comment, so there isn’t a dangling </script> tag floating in the HTML. Poof the Web Browser is now showing the Treatment HTML.

After we met with Eric he posted a link and an explanation for this trick at

Editor’s Note: The original link above is currently dead, I’ve reposted the original article for reference at:

We have a Rails Plugin that encapsulates all of this logic pretty well – so you don’t have to worry about all of the javascript hacks:

Happy A/B + Multivariate Testing!

Server-Side Dynamic Section Variations

Editor’s Note: The original URL is: This is an archived/re-post retrieved from the Wayback Machine: Internet Archive of an original blog post written by Eric Vasilik, but re-posted since the site is no longer available & the content has been deemed save-worthy.

This technique is one of my favorites because it involves some pretty “clever” (euphemism for twisted) JavaScript. But don’t let this scare you, the script works on all the browsers and is as fail-safe as the scripting that Website Optimizer requests you place on your sites by default.

Multi-Variate Experiments “Out of the Box”

First, let’s revisit certain aspects of GWO pertinent to this technique. By default, GWO handles multi-variate experiments in the following way:


Your test page and the default content for your experiment sections are served directly from your web server. If alternative content has been chosen to be displayed to a visitor, that alternative content is served from a Google server.

Now, this poses a particular limitation: the alternative content must be static in nature. The reason behind this is in the fact that, in the default setup process of a multi-variate experiment, you are requested to input the alternative content into GWO’s user interface, and that content is simply served back to your test page on demand where it replaces the default content in the page for visitors selected to see the alternative content.

This means that if you wanted to customize that alternative content differently for each visitor, you don’t get a chance to do so. For example, you might want to include the customer’s name in the alternative content. Or, you might want to serve a promotion customized for the given customer.

Alternative Content Served from Your Webserver

The technique I am about to discuss allows you to serve all content, default and alternative, directly from your web server:


Here, the Google server does not serve any alternative content. It only serves back an indicator (an integer) of which content should be show to a given visitor. All the possible variations for the sections are rendered into the web page by your web server where you have complete dynamic control over the content of those variations. In the following, I will show you the scripts you need to generate along with that content in order to show one of either your default content or variations.

Size of Alternatives

That said, one should be careful when using this technique because it requires you to render all possible section variations into the page. Because even though your web server knows what the content of the alternatives are, it does not know which alternative will be chosen for a visitor to your test page. Contact with the Google server is required for that, and the logic about which content to show to the visitor must be executed in the browser client.

So, if the number and size of all the alternative section variations is not too large, you can use this technique. Many times, this is the case. Even if you define your entire page to be a single section, this technique may work for you because only the HTML of the alternative need be present in your page. Any other resources, like images, scripts or style sheets, which are specific to an alternative variation will be loaded if that alternative variation is chosen for a visitor. Content which was not chosen for the visitor will not even be parsed by the browser, it will essentially be thrown away.

Creating the Experiment

To use this technique, you begin by creating a regular multi-variate experiment. Give the experiment a name, test page and goal page. When asked to add the GWO scripts to the page, add the control script in the normal way. And, add the tracking scripts in the normal way.

However, do not follow the default instructions for adding section scripts. I have prepared an example test page you can look at which illustrates the alternative to the default section scripts which allows you to serve alternative section variations from your web server:

Server Side Sections Example:

<h2>Server-side Dynamic Content - Test Page</h2>

<p>Content Before Section1

<!-- Section1 - Default content -->
var GWO_Section1 = utmx("variation_number", "Section1");
if (GWO_Section1 != undefined && GWO_Section1 != 0) document.write('<no' + 'script>');
Default content - shown by default<br>

<!-- Section1 - Variation 1 -->
if (GWO_Section1 == 1) document.write('</noscript a="');
Alternative content 1<br>

<!-- Section1 - Variation 2 -->
if (GWO_Section1 == 2) document.write('</noscript a="');
Alternative content 2<br>

<p>Content After Section1<br>

<script type="text/javascript">
if(typeof(_gat)!='object')document.write('<sc'+'ript src="http'+
<script type="text/javascript">
try {
var pageTracker=_gat._getTracker("UA-7250447-1");

First, you will want to declare the number and names of the server-side dynamic experiment sections you plan to test. Sections are normally declared as a result of surrounding the default content of a section with the standard GWO sections scripts. But, because we are not using those, you need to use an alternative. So, to declare a single section with the name “Section1”, place the following immediately after the Control Script:

<!-- utmx section name="Section1" -->;

You can repeat this kind of comment to declare up to 8 sections. For example:

<!-- utmx section name="Section1" -->
<!-- utmx section name="Section2" -->
<!-- utmx section name="Section3" -->

Note that you can mix regular GWO multi-variate section with server-side dynamic sections. Simply include the standard GWO style sections as described in the default install instructions.

Instrumenting the Sections

The following script is the entire definition of the section from my example page. I show you in its entirety here, and will dissect it later. Note that the dynamic content for each variation is highlighted. These are the parts of the page you get to dynamically generate. Only one of them will be show to a given visitor, the others will be stripped away.

var GWO_Section1 = utmx("variation_number", "Section1");
if (GWO_Section1 != undefined && GWO_Section1 != 0) document.write('<no' + 'script>');
Original content - shown by default<br>

if (GWO_Section1 == 1) document.write('</noscript a="');
Alternative content 1<br>

if (GWO_Section1 == 2) document.write('</noscript a="');
Alternative content 2<br>

The basic idea with this technique is that each of these script blocks controls a piece of content. The first controls the original content. By default, the original content is show to the visitor. The other script blocks control the alternative pieces of content, one of which, is meant to replace the original content. By default the alternatives are hidden from the visitor. If an alternative is chosen to be shown to the visitor, then the script blocks will work together to hide the original and show only one of the alternatives to the visitor.

The content contained in each of these script blocks is totally under your control in your web server. Which one of them is shown to the visitor is under the control of Website Optimizer.

The Default Content

Like any GWO experiment, the default content is encoded in your test pages, and if JavaScript is not present or disabled, or there is any malfunction anywhere, the default content will be presented to your visitors. With this technique, the default content is handled with this script:

var GWO_Section1 = utmx("variation_number", "Section1");
if (GWO_Section1 != undefined && GWO_Section1 != 0) document.write('<no' + 'script>');
Original content - shown by default<br>

Here, the script code firsts obtains the number of the variation for the section named “Section1” chosen for the current visitor:

var GWO_Section1 = utmx("variation_number", "Section1");

This call to the utmx function will return a 0 (zero) or undefined if the visitor should see the default content. This value is saved in a global variable for use in subsequent alternative content scripts. Note that the utmx function is defined by the Control Script which needs to have been executed before the call to the utmx function in this script.

Then, if alternative content has been chosen for this visitor, the default content is hidden from the visitor with the second line of code:

if (GWO_Section1 != undefined && GWO_Section1 != 0) document.write('<no' + 'script>');

By document.writing a beginning <noscript> tag, the content after the script and up to the first </noscript> tag will be consumed and ignored by the parser. This requires that your default content not contain any noscript tags (beginning or ending). This is exactly the same technique used by GWO for standard installations of multi-variate experiments. The only difference is that here we are just removing the default content, but the standard GWO multi-variate technique document.writes the alternative content to replace the default content before writing the <noscript> tag to eliminate the default content.

The Alternative Content

Now, for each variation of alternative content for a given section, you will need a script like this:

if (GWO_Section1 == 1) document.write('</noscript a="');
Alternative content 1<br>

Note that the 1 indicates that this script is customized for the first alternative. The second alternative will have the number 2, the third 3, etc. The larger highlighted part is your dynamically generated alternative content for the first alternative. Simply have your web server surround the alternative content with the other text.

The first line of the script determines if this alternative was chosen to be viewed by the visitor and document.writes some content designed to cause the variation to be shown to the visitor:

if (GWO_Section1 == 1) document.write('</noscript a="');

To understand this better, consider what this content would look like if scripting is disabled, or the value of GWO_Section1 does not have the value 1. That is, there is no script:

Alternative content 1


This entire block of HTML is nothing more than one large comment. In fact, the alternative content is hidden by default by the fact that it is embedded inside a comment. This means that the alternative content must not have any comments in it. Note that even the script at the end of the HTML is also inside the comment.

Now, look carefully at what is written if this section variation has been chosen to be displayed to the visitor:

</noscript a="

This is the beginning of an ending noscript tag. Note that there is no > terminating the tag. Also, in this tag there is the beginning of an attribute. Notice, also, that the value of the attribute is not present and that the ending double quote is not present. That is not two single quotes. It is a single double quote.

Recall that the way that document.write works is that the written text is, essentially, inserted after the end of the script where the browser parser will resume its parsing after the script has executed. Again, by stripping away the first script tag, let’s look at what the parser will encounter:

</noscript a="<!--">
Alternative content 1<br>

Here the parser sees an ending noscript tag with an attribute whose value are the characters which begin a comment. The thing to know here is that HTML parsers allow beginning comment sequences inside attribute values. This is the clever (twisted) part I eluded to earlier.

Now, it should be apparent why there was the “> characters immediately after the beginning comment character sequence: <!–. It is there to terminate the dynamically written ending noscript tag. This tag “eats” the beginning comment token. Yummy.

This allows the parser to parse and display the alternative content. Now, all we need to do is deal with the remaining ending comment token! This is done by the last part of the script:


Which injects a beginning comment token which is terminated by the remaining ending comment token, statically present in the page. Without this document.write, the “–>” would appear in the page when this alternative content was chosen for the visitor.

This is how each alternative variation is handled. Simply do the above for each alternative section. Each one will have the server generated content of the variation. Each script will have the number of the variation encoded in it. 1 for the first alternative, 2 for the second, etc.

You can repeat this sequence of scripts for a section as many times on your page(s) as you want to hide/show the default/alternative content for that section.

Setting Up the Variations in GWO

Even though this technique requires you generate all your alternative content into the page, you will still need to create section variations in Step 3 of the GWO user interface for each section in your test. The only difference is that you do not supply any content for these variations. The reason for this is that GWO still needs to know how many variations each server-side dynamic section has for the purposes of choosing which variation visitors will see and reporting results. For example:


Shows the section named “Section1” with two (2) variations created for it. I give each variation a name for reporting purposes, but I do not need to give it any content.

After Doing this, all that is left to do is preview the experiment to make sure the scripts are working and launch the experiment!