About Adam Taylor...

I am a student, blogger, lazy entrepreneur....


I write about: Analytics, blogging, search engine optimisation and social media marketing.



Archive: Gray Hat SEO

Are Madlib Sites Whitehat?

I’m never really sure what to write about these days, so I figured it was about time I checked my logs to see what people were looking for. I mean hell, it’s what the gurus suggest (damn, I was looking for a relevant link but couldn’t find one; still, I’m sure that’s what they suggest)!

What the hell is a madlib site?

Madlib, or mad libs as Wikipedia refers to it, was a children’s game where you had to come up with alternative words and then fill in blanks in a story. Something like that anyway, I’ve never played it, I’m not American.

How does this relate to a website? The idea is usually implemented in such a way that one can spin large amounts of legible, legitimate (? - we’ll get to that later) content from a small amount of content.

So for example, the title of this post is ‘Are Madlib Sites Whitehat?’. If you wanted to ‘madlibify’ that title you may come up with something like this:

Are (Madlib|Mad Lib|Mad-Lib) (Sites|Websites|Blogs) (Whitehat|Blackhat|Grayhat|Illegal|Legitimate)?

1. Are Madlib Blogs Whitehat?
2. Are Mad Lib Websites Illegal?
etc.

Now you could fire this through your madlib function and you have a load of unique titles for a small amount of input.
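
If it helps to see that in code, here’s a rough sketch of what such a madlib function might look like in Perl. It assumes the (a|b|c) syntax from the title above with no nested brackets; expand_template is a made-up name for this example, not something from a module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Expand a template such as "Are (Madlib|Mad Lib) (Sites|Blogs)?" into
# every possible combination. expand_template is a hypothetical helper
# written for this post; nested brackets are not supported.
sub expand_template {
    my ($template) = @_;

    # Find the first "(a|b|c)" group; if none remain, the string is final.
    if ($template =~ /\(([^()]*)\)/) {
        my ($start, $end) = ($-[0], $+[0]);
        my @options = split /\|/, $1, -1;
        my @results;
        for my $option (@options) {
            # Swap the whole group for one option, then expand the rest.
            my $candidate = $template;
            substr($candidate, $start, $end - $start) = $option;
            push @results, expand_template($candidate);
        }
        return @results;
    }
    return ($template);
}

my @titles = expand_template(
    'Are (Madlib|Mad Lib|Mad-Lib) (Sites|Websites|Blogs) '
  . '(Whitehat|Blackhat|Grayhat|Illegal|Legitimate)?'
);
print scalar(@titles), " titles\n";
print "$_\n" for @titles[0 .. 2];
```

For the title above that works out to 3 x 3 x 5 = 45 variations from one line of input.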

Take this idea further, and you can see how it could be used to create 100 blog posts instead of 1 blog post. Or alternatively, let’s say you have a massive database-driven website: you might ‘madlibify’ what would otherwise be static portions of each page, in the hope of getting more pages indexed.

Madlib sites are almost certainly not whitehat

Nearly every method of content creation that involves some kind of automation is pushing or breaking the limits of what search engines consider to be legitimate, whitehat content.

I think it’s a pretty safe bet to assume that if a search engine engineer came across a blog of yours with 100 very similar posts they would nuke it regardless of how unique/readable/whatever the content was.

Depending on your goals, it may be a fine option; it makes a change from the usual scraping at least.

If you are genuinely interested in a fully fledged system involving madlibbing/content creation/automatic posting/blog farms etc., I’d take a look at datapresser.

Mad-Lib Perl Snippet

I bashed up a quick mad-lib Perl class for use in some future scripts. If you don’t know what mad-lib is, it’s basically a function that, when given an input such as us:we:i:them:she:he, will select one of the options at random. Given a whole block of text with multiple parts to be ‘mad-libbed’, it can crank out many chunks of readable, unique content.

#!/usr/bin/perl
package madlib;
use Moose;

## Fields for madlib objects
has 'phrase' => (isa => 'Str', is => 'rw');
has 'debug' => (isa => 'Bool', is => 'rw', default => '0');

## Method to select a random word from the phrase
sub madlib {
    my $self = shift;
    if ($self->debug) { print "Phrase = ".$self->phrase."\n"; }
    my @words = split(/:/,$self->phrase);
    my $length = @words;
    if ($self->debug) { print "Length of \@words = ".$length."\n"; }
    my $rand = int(rand($length));

    return($words[$rand]);
}

## A module must end with a true value
1;

=head1 NAME

madlib - A class to perform 'madlib' function.

=head1 SYNOPSIS

    use madlib;
    my $object = madlib->new();
    $object->phrase("some:phrase:to:madlib");
    $object->debug(1); # to get some debugging output
    print $object->madlib();

=head1 DESCRIPTION

This class provides the ability to 'madlib' a phrase/string to create unique content variations.

=head2 Methods

=over 12

=item C<new>

Returns a new madlib object.

=item C<madlib>

Returns a random word from the specified phrase.

=back

=head1 AUTHOR

Adam Taylor - http://www.conversion-matters.co.uk.

=cut

Won’t do much on its own but used within some other systems it could be quite useful. I’m thinking directory submission, social bookmarking etc..

(Yes it displays wonky; apologies…)
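
For what it’s worth, here’s a hedged sketch of how the same pick-one idea might be stretched over a whole block of text using only core Perl. The {a:b:c} slot delimiters and the names spin_text and pick_one are assumptions for this example; the class above only deals with a single colon-separated phrase.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same idea as the madlib method: split on ":" and return a random element.
sub pick_one {
    my ($phrase) = @_;
    my @words = split /:/, $phrase;
    return '' unless @words;
    return $words[ int rand @words ];
}

# Replace every {a:b:c} slot in a block of text with one randomly chosen
# option. The {...} delimiters are an assumption made for this sketch.
sub spin_text {
    my ($text) = @_;
    $text =~ s/\{([^{}]*)\}/pick_one($1)/ge;
    return $text;
}

my $template = '{We:I:They} {love:like:enjoy} {directories:bookmarks}.';
print spin_text($template), "\n" for 1 .. 3;
```

Each call picks afresh for every slot, so running it a few times over the same template gives a handful of different sentences.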

Creative Technique For Master-Link-Baiters

I was in the shower and an idea popped into my head, not something I’ve heard explicitly talked about before. I can only assume this is due to one of the following:

  1. It could be a completely rubbish idea
  2. It may be being used by sneaky people who don’t want it revealed
  3. It could be a unique and creative idea not thought of before (unlikely but one can dream.. ha!)

Either way, I don’t have the required skills to test the technique, so I’ll just talk about it and let other people decide whether it’s awful or not.

I saw a tweet by Lyndoman about a piece of link bait he’d written on the subject of credit cards that hit the front page of digg.

Then my brain went on a random path of thoughts, something like:

‘eh front page digg -> wonder how much he makes -> maybe he should bait and switch or use social media to promote his own affiliate sites; could make some nice extra dough -> hey, how about bait and cloak?’

Introducing Bait and Cloak

I imagine someone has already thought of this idea, but it’s just possible that it has never been executed; the time and effort required to pull it off would probably put off many black hats who might be tempted to try it.

What’s the frigging idea Adam?!

Everyone’s heard of bait and switch I assume: essentially writing link-bait and then 301 redirecting to another page. The problem is, if people notice, some of them get a bit pissy with you.

The bait-cloak idea’s simple:

  • Start a blog
  • Write quality link bait for 3-6 months on various topics: credit cards, gambling, travel etc.
  • Start redirecting Googlebot (identified by IP of course) and people who arrive via Google from the posts to highly targeted, niche affiliate offers.

After many successful front-pages, the domain should have built enough trust and authority to be able to rank for these offers.

Also this way people visiting from ‘citations’, or from Digg/Stumble etc. still see the linkbait but others see the affiliate offers.

Someone would no doubt notice eventually, but if you keep it quite niche/long tail you could probably keep this going for quite a while (long enough to turn a profit I reckon).

Give it a go Lyndoman or Graywolf - I know you love the link-bait and the shadier side of SEO ;).

Good idea/rubbish idea? Your thoughts welcome as ever..

A little script for you…

A fair few weeks ago I wanted to see if I could still code any Perl, which I’m glad to say [I think] I can, so I decided to port a little PHP script to Perl. For some reason I can’t access the original post/script anymore, but it was deStone’s delicious referral spammer - if you can get the link to work, your computer is better than mine!

So all it does is load up a list of keywords, go through each of the associated tag pages and then visit each site listed there, sending whatever URL you want to promote as the referer.

It’s pretty pointless, I can’t imagine too many webmasters still look at old school stats packages, but I could be wrong.

#!/usr/bin/perl -w
use strict;
use WWW::Mechanize;

&main();

sub main {
	my $throttle = 10;
	my $mech = WWW::Mechanize->new();
	$mech->agent_alias( 'Windows IE 6' );

	## get our list of keywords from the dictionary file
	open (KEYWORDS, '/Users/adam/Documents/Perl/dictionary.txt')
		or die "Cannot open dictionary file: $!";
	my @tags;
	while (<KEYWORDS>) {
		my $tag = $_;
		chomp($tag); push(@tags,$tag);
	}
	close (KEYWORDS);

	my $url;
	my $result;

	## grab del.icio.us url for each tag
	foreach my $tag (@tags) {
		print "$tag\n";
		$url = "http://del.icio.us/tag/" . $tag . "?setcount=100";
		print "$url\n";
		$mech->get( $url );
		$result = $mech->content;
		&process($result, $mech);
		sleep($throttle);
	}
}

## scrape del.icio.us for urls
sub process {
	my ($result, $mech) = @_;

	my @urls;
	my @links = $mech->links();
	foreach my $link (@links) {
		## keep absolute links that point away from del.icio.us
		if ($link->url() =~ /^http/ && $link->url() !~ /del\.icio\.us/) {
			#print $link->url() . "\n";
			push(@urls,$link->url());
		}
	}

	## pass the mech object along with the list of urls
	&ref_spam($mech, @urls);
}

## hit up each site with our url to promote as the referer
sub ref_spam {
	my ($mech, @urls) = @_;

	my $promote_url = "http://www.test.com";
	$mech->add_header( Referer => $promote_url );

	foreach my $url (@urls) {
		$mech->get($url);
	}

}

[I’m aware it displays a bit messed up - you’ll have to deal with it.]

This no longer works for me ‘cos I’m in halls and would need to add proxy support, but I think it worked ;)!

Play around with it if you want; I’m not advocating any particular uses for it or being massively helpful on how to run it - deliberately.

I’m sure it could be improved upon - it could use proxies or threads, or be made more OO - but I only hacked it up to check I could still code Perl.

Here’s the dictionary file and the Perl script (rename to .pl).
