Developers don’t do SEO. They make sure sites are SEO-ready.
That means developers hold the key to SEO. It’s true. If you’re a developer and you’re reading this, laugh maniacally. You’re in control.
You control three things: viability, visibility, and site flexibility.
This post provides guidelines for all three.
This isn’t a navel-gazing philosophical question.
For this article’s purposes, a developer connects site to database (or whatever passes for a database, don’t get all anal-retentive on me), builds pages using the design provided, and does all the work those two jobs require.
A developer does not design. They do not write content. If you do all three jobs, tell the designer/content parts of your brain to take a break. This post isn’t for them.
Viability: Stuff you do on the server and in early software configuration that readies a site for ongoing SEO.
Mostly I chose this word because the other two ended with “ility,” and it just works.
Server logs are an SEO source of truth. Log file analysis can reveal all manner of crawler hijinks.
Every web server on the planet has some kind of HTTP log file.
And now someone’s going to tweet me their platform that, in defiance of all logic, doesn’t generate log files. OK, fine.
99% of web servers on the planet have some kind of log file.
Happy? Great. Now go make sure your server generates and saves HTTP logs.
Most servers are set up correctly out of the box, but just in case, make sure log files include:
Also make sure that:
Log files, folks. Love ’em. Keep ’em. Share ’em.
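Most traditional web servers (Apache, Nginx, IIS) write these logs out of the box. If your app server handles HTTP itself, here's a minimal sketch, assuming a Node/Express stack with the morgan logger; the file path and port are illustrative:

```typescript
import express from "express";
import morgan from "morgan";
import fs from "fs";
import path from "path";

const app = express();

// Append Apache-style "combined" log lines (client IP, timestamp, request,
// status, referrer, user agent) to a file the SEO team can analyze later.
const logStream = fs.createWriteStream(path.join(__dirname, "access.log"), {
  flags: "a",
});
app.use(morgan("combined", { stream: logStream }));

app.get("/", (_req, res) => res.send("OK"));
app.listen(3000);
```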
Why does everyone treat analytics like a light switch? Paste the script, walk away, boom, you’ve got data.
Nope.
Before you add that JavaScript, make sure your analytics toolset—Google, Adobe, whatever—can:
Is this all SEO stuff? Not exactly. But it all helps the SEO team. Is this your job? Maybe not. But you’re on the Dev team. You know you’re the top of the escalation tree for everything from analytics data to printer malfunctions. When they can’t find the data they need, the SEO team will end up at your door.
Hopefully, you already know all about robots.txt. If not, read this guide.
Even if you do, keep in mind:
Use the right response codes:
200: Everything’s OK, and the resource exists
301: The resource you requested has moved permanently. Poof, it's not coming back here. Look at this other one instead
302: The resource you requested has moved temporarily. It might come back. Look at this other one for now
40x: The resource you requested can’t be found. Oops
50x: Gaaaahhhhh help gremlins are tearing my insides out in a very not-cute way. Nothing’s working. Everything’s hosed. We’re doomed. Check back later just in case
Some servers use 200 or 30x responses for missing resources. This makes Sir Tim Berners-Lee cry. It also makes me cry, but I don’t matter. Change it.
Even worse, some CMSes and carts come configured to deliver a 200 response for broken links and missing resources. The visiting web browser tries to load a missing page. Instead of a 404 response, the server delivers a 200 ‘OK’ response and keeps you on that page.
That page then displays a ‘page not found’ message. Crawlers then index every instance of that message, creating massive duplication. Which becomes a canonicalization issue (see below) but starts as a response code problem.
Yes, Google says they’ll eventually figure out whether you meant to use a 302 or a 301. Keyword: eventually. Never wait for Google. Do it right in the first place.
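If you're on a Node/Express stack, doing it right can look like this minimal sketch; the routes and paths are illustrative:

```typescript
import express from "express";

const app = express();

// 301: the old URL moved permanently; send browsers and crawlers to the new one.
app.get("/old-shoes", (_req, res) => res.redirect(301, "/products/shoes"));

// 302: temporary move; the original URL will be back.
app.get("/sale", (_req, res) => res.redirect(302, "/summer-sale"));

// 404: missing pages must return a real 404 status, not a 200 with a
// "page not found" message in the body (the soft-404 problem above).
app.use((_req, res) => res.status(404).send("Page not found"));

app.listen(3000);
```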
I make no judgments regarding the pluses or minuses of these. But plan ahead and configure them before you launch:
Check ’em off now, so you don’t have to deal with them later:
I just found out that I have high cholesterol, which is irritating because I eat carefully and bike 50–100 miles/week. But whatever.
MY POINT HERE is that server viability fights potential blockages by making sure your SEO team can get straight to work…
This is a horrible analogy. Moving on.
This is what everyone thinks about: how you build a site impacts search engines’ ability to find, crawl, and index content. Visibility is all about the software.
Every resource on your site should have a single valid address. One. Address. Every page, every image.
Canonicalization problems can cause duplicate content that, in turn, wastes crawl budget, reduces authority, and hurts relevance. Don’t take my word for it. Read Google’s recommendation. If you follow these recommendations, you’ll avoid 90% of canonicalization problems:
If your domain is www.foo.com, then your home page should “live” at www.foo.com.
It shouldn’t be
www.foo.com/index.html
www.foo.com/default.aspx
www.foo.com/index.php
or anything else. Those are all canonically different from www.foo.com. Make sure all links back to the home page are canonically correct.
Don’t depend on rel=canonical or 301 redirects for this. Make sure all internal site links point to the same canonical home page address. No site should ever require a 301 redirect from internal links to its own home page.
Make sure that the link to page one of a pagination tunnel always links to the untagged URL. For example: If you have paginated content that starts at /tag/foo.html, make sure that clicking ‘1’ in the pagination links takes me back to /tag/foo.html, not /tag/foo.html?page=1.
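A minimal sketch of that rule as a link-building helper; the function name and URLs are illustrative:

```typescript
// Page 1 always gets the canonical, untagged address; only deeper pages
// get the ?page=N parameter.
function pageUrl(baseUrl: string, page: number): string {
  return page <= 1 ? baseUrl : `${baseUrl}?page=${page}`;
}

console.log(pageUrl("/tag/foo.html", 1)); // "/tag/foo.html"
console.log(pageUrl("/tag/foo.html", 3)); // "/tag/foo.html?page=3"
```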
Friends don’t let friends create links like this:
<a href="~">
Those can create infinitely-expanding URLs:
/en-us/
/en-US/US-Distribution
/en-US/~/link.aspx?_id=6F0F84644AC94212ACA891D5AE1868C9&_z=z
/en-US/~/~/link.aspx?_id=B682300BEAD24C0ABC268DB377B1D5A0&_z=z
/en-US/~/~/~/link.aspx?_id=6F0F84644AC94212ACA891D5AE1868C9&_z=z
/en-US/~/~/~/~/link.aspx?_id=B682300BEAD24C0ABC268DB377B1D5A0&_z=z
/en-US/~/~/~/~/~/link.aspx?_id=6F0F84644AC94212ACA891D5AE1868C9&_z=z
/en-US/~/~/~/~/~/~/link.aspx?_id=B682300BEAD24C0ABC268DB377B1D5A0&_z=z
/en-US/~/~/~/~/~/~/~/link.aspx?_id=6F0F84644AC94212ACA891D5AE1868C9&_z=z
/en-US/~/~/~/~/~/~/~/~/link.aspx?_id=B682300BEAD24C0ABC268DB377B1D5A0&_z=z
Never hard-code relative links, unless you want to be the comic relief in an SEO presentation.
Don’t use query attributes to tag and track navigation. Say you have three different links to /foo.html. You want to track which links get clicked. It’s tempting to add ?loc=value to each link. Then you can look for that attribute in your analytics reports and figure out which links get clicked most.
You don’t need to do that. Instead, use a tool like Hotjar. It records where people click, then generates scroll, click and heat maps of your page.
If you absolutely must use tags, then use /# instead of ? and change your analytics software to interpret that, so that ?loc=value becomes /#loc=value. Web crawlers ignore everything after the hash sign.
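Here's a minimal browser-side sketch of reading that hash-based tag; the "loc" parameter name is illustrative, and you'd hand the value to whatever analytics tool you use:

```typescript
// location.hash looks like "#loc=header-nav"; crawlers never see it,
// so the canonical URL stays clean.
function getClickSource(): string | null {
  const params = new URLSearchParams(window.location.hash.slice(1));
  return params.get("loc");
}

const source = getClickSource();
if (source) {
  // Pass the value along as a custom dimension/event in your analytics tool.
  console.log("clicked from:", source);
}
```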
Whether you have canonicalization issues or not, make sure you:
It’s best to fix canonicalization issues by doing it right: build your site to have a single address for every page.
If you can’t do that, though, use these:
Please don’t do these things:
In other words, no funny business. Do it right from the start.
Performance is done to death, so I’m going to keep it short. First, a brief sermon: page speed is an easy upgrade that gets you multiple wins. Faster load time means higher rankings, sure. It also means higher conversion rates and better UX.
Start by running Lighthouse. Sample several pages. Use the command line to batch things. The Lighthouse GitHub repository has everything you need.
Lighthouse isn’t perfect, but it’s a helpful optimization checklist. It also tests accessibility for a nice 2-in-1.
Do all the stuff.
Regardless of the test results:
You can also consider installing page speed modules. I’d never do this. I don’t want Google software running directly on my server. But they do a lot of work for you. You decide.
A few other quick tips:
Chances are, someone else will add a bunch of third-party scripts and clobber site performance. You can get off to a good start:
If you’re loading assets from a separate site, consider using DNS prefetch. That handles the DNS lookup ahead of time: <link rel="dns-prefetch" href="//foo.com" />
That reduces DNS lookup time. More on that:
Find the most popular resources on your site and use prefetch (not to be confused with DNS prefetch, above). That loads the asset when the browser is idle, reducing load time later: <link rel="prefetch" href="fonts.woff" />
Be careful with prefetch. Too much will slow down the client. Pick the most-accessed pages and other resources and prefetch those.
Build your site to avoid ‘thin’ content: pages with very little content and little unique information.
Avoid these things. Don’t laugh. I still find this kind of stuff in audits all the time:
Don’t wait for an SEO to make you go back and fix it. Build to prevent this kind of stuff:
window.location or something similar. Crawlers will ignore everything after the hash.

We’ve already dealt with title elements and such, so this is a lot easier. Every page should:
While heading tags don’t necessarily affect rankings, page structure as evidenced by rendering does. H1 is the easiest way to represent the top level in the page hierarchy.
Have a single H1 that automatically uses the page headline, whether that’s a product description, an article title, or some other unique page heading. Do not put the logo, images or content that repeats from page to page in an H1 element.
Allow multiple H2, H3, and H4 elements on the page. Let content creators use H2, H3, and H4. You can let them drill down even further, but I’ve found that leads to some, er, creative page structures.
Any developer knows this. Content creators sometimes don’t. I still see many writers insert double line breaks. It’s not easy, but if you can somehow enforce the use of <p> elements for paragraphs, it will make later tweaks to styles a lot easier.
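If your CMS hands you raw text, a minimal sketch of enforcing <p> elements at render time might look like this; the function name is illustrative:

```typescript
// Wrap blocks of text separated by blank lines in <p> elements,
// so writers' double line breaks become real paragraphs.
function paragraphize(text: string): string {
  return text
    .split(/\n{2,}/)
    .map((block) => block.trim())
    .filter((block) => block.length > 0)
    .map((block) => `<p>${block}</p>`)
    .join("\n");
}

console.log(paragraphize("First paragraph.\n\nSecond paragraph."));
```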
At a minimum, generate structured markup for:
See schema.org for more information. Right now, JSON-LD is the most popular way to add structured data. It’s easiest, and if you (properly) use a tag manager, you can add structured data to the page without changing code.
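Here's a minimal JSON-LD sketch for an article page, injected client-side the way a tag manager would; every value is illustrative, so swap in your own page data:

```typescript
// Build an Article schema object and inject it as a JSON-LD script element.
const articleSchema = {
  "@context": "https://schema.org",
  "@type": "Article",
  headline: "How Developers Make Sites SEO-Ready",
  datePublished: "2019-01-01",
  author: { "@type": "Person", name: "Jane Developer" },
};

const script = document.createElement("script");
script.type = "application/ld+json";
script.textContent = JSON.stringify(articleSchema);
document.head.appendChild(script);
```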
I can hear you. No need to mutter. You’re saying, “None of this impacts rankings.”
It may. It may not. But using standard page structure improves consistency across the site for every content manager and designer who will work on it. That leads to good habits that make for a better site. It leads to less hacky HTML code pasted into the WordPress editor. That means a more consistent user experience. Which is good for rankings.
So there.
Video libraries are great, but having all of your videos on a single page makes search engines cry. Put each video on its own page. Include a description and, if you can, a transcript. Link to each video from the library. That gives search engines something to rank.
Where possible, create URLs that make sense. /products/shoes/running is better than /products?blah=1231323
Readable URLs may not directly impact rankings. But they improve clickthrough because people are more likely to click on readable URLs.
Also, Google bolds keywords in URLs.
Finally, what are you more likely to link to?
/asdf?foo=asdfakjsd;fkljasdf
or
/asdf/shoes/ ?
Yeah, yeah, go ahead and hurl insults. I’ve heard it all before. If you want to argue about it, go read this post first.
All quality content should ‘live’ on the same domain. Use subfolders. The blog should live at /blog. The store should live at /store or similar. I always get pushback on this one. Google has said in the past that subdomains are OK. Yes, they’re OK. They’re not the best. Google says subdomains are sometimes just as good. Not always.
When Googlebot comes across a subdomain, it decides whether to treat it as a subfolder or not. Like many things Google does and says, they’re unclear about it and results differ. I have no test data. I can say this: in most cases, moving content to a subfolder helps, if by ‘most’ we mean ‘every site I’ve ever worked on.’
So why leave it to chance? Use a subfolder now, and you won’t have to deal with subdomains and unhappy marketers later.
There are two exceptions to the rule:
The most common reason folks use subdomains is the blog: The CMS, or server, or something else doesn’t support a blog. So you set up a WordPress.com site.
That ends up being blog.something.com. If you have to do that, consider using a reverse proxy to put it all under one domain. Of course, if you have no choice, use a subdomain. It’s better than nothing.
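If your stack is Node/Express, a minimal reverse-proxy sketch using http-proxy-middleware might look like this; the blog hostname is illustrative, and you may need to tweak path handling for your setup:

```typescript
import express from "express";
import { createProxyMiddleware } from "http-proxy-middleware";

const app = express();

// Serve the externally hosted blog under /blog on the main domain
// instead of exposing blog.example.com as a subdomain.
app.use(
  "/blog",
  createProxyMiddleware({
    target: "https://blog.example.com",
    changeOrigin: true, // rewrite the Host header for the upstream blog
  })
);

app.listen(3000);
```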
Just don’t. Nofollow is meant to prevent penalties for links from comments and advertising. It doesn’t help channel PageRank around a site. It does burn PageRank. It’s a bad idea.
The only time to use nofollow is to avoid a penalty because you’re linking to another site via ads or other paid space on your site. A good rule of thumb: If you’re doing something ‘just’ for SEO, think carefully. Nofollow is a good example.
Clicking the top-level navigation should take me somewhere other than ‘/#’.
Top-level nav that expands subnav but isn’t clickable creates three problems:
Make sure clicking any visible navigation takes me somewhere.
If you want a page indexed, I need to be able to reach it by clicking on links. Forms, JavaScript maps, etc. aren’t enough. For example: If you have a stores directory, keep the map and ZIP code search.
Just make sure there’s also a clickable index I can use to find stores. That means I can link to it, too. This rule is particularly important when you work with JavaScript frameworks. See the next chapter for more about that.
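A minimal sketch of that crawlable fallback, rendered server-side; the store data and URL pattern are illustrative:

```typescript
interface Store {
  name: string;
  slug: string;
}

// Plain <a> links crawlers can follow even if they never touch the map
// or the ZIP-code search form.
function storeIndexHtml(stores: Store[]): string {
  const items = stores
    .map((s) => `<li><a href="/stores/${s.slug}">${s.name}</a></li>`)
    .join("\n");
  return `<ul>\n${items}\n</ul>`;
}

console.log(storeIndexHtml([{ name: "Downtown", slug: "downtown" }]));
```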
Until, oh, last week (seriously, Google just changed this last week), Google said they wouldn’t consider content that only appeared after user interaction. Content behind tabs, loaded via AJAX when the user clicks, etc. got zero attention.
Last week, the big G said they do examine this content, and they do consider it when determining relevance. I believe them, but as always, they’ve left out some details:
Oh, also: The old tiny-content-at-the-bottom-of-the-page trick still doesn’t work. That’s not what they meant.
JavaScript isn’t bad for indexing or crawling. JavaScript is bad for SEO.
Instead of typing yet another diatribe about the evils of JavaScript, I’ll link to mine and add a few quick notes:
First, before you get into complicated ways to mitigate the SEO problems caused by many frameworks and JavaScript widgets, ask yourself, ‘Why am I building my site this way?’ If there’s no compelling argument–if using a framework doesn’t offer essential features–consider doing something else.
This is the easy part: if you’ve got content on the page for which you want to rank, don’t hide it behind a tab, an accordion, or whatever else. On a well-designed page, people who want to see everything will scroll down. If they don’t want to see it, they weren’t going to click the tab anyway.
If you want content indexed, don’t deliver it based on a user event. Yes, Google says they now index content that reveals after user interaction. Play it safe, though, if you can.
Look at your site’s HAR. Anything that appears after the ‘load’ event is probably not going to get indexed:
The load event, in a HAR
Make sure whatever you want indexed appears before then.
See Make Content Clickable. URLs with /#! and similar won’t get crawled. Google deprecated that as an indexing method.
If you must use JavaScript content delivery, try to mitigate the damage.
No one thinks about this. No. One. SEO requires non-stop tweaks and changes by content managers, analysts, designers, and lots of other non-developers. If they can’t do the work, they bury the resource-strapped development team in requests.
SEO grinds to a halt, and organic performance falls.
I mean, if you have infinite dev resources no worries. Skip the rest of this article. Go back to feeding your pet rainbow-crapping unicorn.
Otherwise, keep reading this relatively brief section.
The title element is a strong on-page organic ranking signal.
First: the meta keywords tag is utterly useless and has been since, oh, 2004. Remove it. If your SEO protests, find a new SEO. With that out of the way, make sure each page has the following editable META tags:
Every page should have an editable description meta tag. The description tag doesn’t affect rankings. It does, however, affect clickthrough rate, which can mean organic traffic growth even if rankings don’t improve. Like the title tag, make the description tag a separate, editable field.
If the page is a product page, have the description tag default to the short product description. If the page is a longer descriptive page, have the description tag default to the first 150 characters of the page content. Never have a blank meta description! If you do, Google and Bing will choose what they think is best. Don’t rely on them.
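A minimal sketch of that fallback logic; the function name and the truncation handling are illustrative:

```typescript
// Use the author's edited description when it exists; otherwise fall back
// to the first ~150 characters of the page content. Never leave it blank.
function defaultMetaDescription(
  editedValue: string | undefined,
  pageText: string
): string {
  if (editedValue && editedValue.trim().length > 0) return editedValue.trim();
  const plain = pageText.replace(/\s+/g, " ").trim();
  return plain.length > 150 ? `${plain.slice(0, 147)}...` : plain;
}
```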
Facebook uses OGP tags to build the text, image, and title of shared content. Without it, Facebook may use the title and meta description tag and pick an image. It may pick something else. OGP tags let the content creator control what will appear on Facebook and, like the meta description tag, they can boost clickthrough.
Have the OGP tags default to the page’s title, meta description and featured image. Then let the author edit them. At a minimum, include og:title, og:type, og:image and og:url. You can read more about OGP tags at http://ogp.me/.
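Here's a minimal sketch of those defaults; the field names are illustrative, and the author's edits win whenever they exist:

```typescript
interface Page {
  title: string;
  metaDescription: string;
  featuredImage: string;
  url: string;
  og?: Partial<{ title: string; description: string; image: string; type: string }>;
}

// Render OGP meta tags, falling back to the page's title, description,
// and featured image when the author hasn't overridden them.
function ogpTags(page: Page): string {
  const og = {
    title: page.og?.title ?? page.title,
    description: page.og?.description ?? page.metaDescription,
    image: page.og?.image ?? page.featuredImage,
    type: page.og?.type ?? "article",
  };
  return [
    `<meta property="og:title" content="${og.title}">`,
    `<meta property="og:type" content="${og.type}">`,
    `<meta property="og:image" content="${og.image}">`,
    `<meta property="og:url" content="${page.url}">`,
    `<meta property="og:description" content="${og.description}">`,
  ].join("\n");
}
```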
Twitter cards are more niche. Twitter will use OGP tags as a fallback, so these aren’t required. If you can add them, though, it gives content creators even more control over what Twitter shows for shared content.
Twitter cards can double clickthrough and other engagement. They’re worth the effort. See https://dev.twitter.com/cards/overview for more information.
The ALT attribute is another strong ranking signal. Every image uploaded as part of page content must have an editable ALT attribute at upload time. If the user does not enter one, default to:
I recommend including “Image:” in the default so that screen readers and other assistive devices identify the text as an image description.
Overuse of classes can create headaches. Use semantic CSS wherever possible: instead of styling a “.h2” class, for example, style the “h2” element itself.
This tip stolen shamelessly from Martijn Oud.
Last updated 2019. Things change. Check back for new stuff.