Performance Tuning Websites

Download speed used to be one of the ways you could tell a real web pro from a graphic designer who knew how to make things pretty.  One of our exercises used to be building a page as a single table (this predated CSS), carefully spanning rows and columns to move the content around: a pain to develop, but fast to render, since Netscape’s browser used to be terrible at “embedded tables.”  Obviously this is archaic (along with worrying about 28.8k modem download speeds), but the concept of a page that is fast to download and fast to render never went away.  When Google officially announced that it would take page load time into account, people finally started to pay attention.  One of the best starting points is Yahoo’s Performance Guide.

However, the process of making a website fast is pretty straightforward:

  1. Home page and other entrances: VERY fast and simple
  2. Limit third party items that might cause delays via DNS or download
  3. Prevent things that can get VERY slow from being on these pages

Numbers one and two get most of the attention, but #3 has potentially the biggest impact and is the most ignored.  For example, adding gzip compression to your server cuts transmitted file size, which saves time and is nice, and can turn 80 ms loads into 20 ms loads, but #3 is where page loads can go from 100 ms to 10-15 seconds.  For example, if you query the database to build your navigation, it’s easier to manage your navigation, but one hiccup at the database level (say, a lock on that table) and your site hangs while loading.  A solution like Memcached moves your “read only” data out of the database and into RAM.  You can still manage it in the database, and the site will update relatively quickly, but there is no reason to consult the database multiple times for information that changes infrequently.
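
As a rough sketch of that pattern (assuming PHP with the Memcached extension, a local memcached instance, and a made-up navigation table; none of this is tied to any particular CMS):

    <?php
    // Sketch: serve the navigation from memcached, falling back to the database.
    $cache = new Memcached();
    $cache->addServer('127.0.0.1', 11211);

    $nav = $cache->get('site_nav');
    if ($nav === false) {
        // Cache miss (or memcached restarted): hit the database once.
        $db = new PDO('mysql:host=localhost;dbname=site', 'user', 'pass');
        $stmt = $db->query('SELECT label, url FROM navigation ORDER BY position');
        $nav = $stmt->fetchAll(PDO::FETCH_ASSOC);

        // Keep it for five minutes; edits made in the database show up at the next expiry.
        $cache->set('site_nav', $nav, 300);
    }

    foreach ($nav as $item) {
        printf('<a href="%s">%s</a>', htmlspecialchars($item['url']), htmlspecialchars($item['label']));
    }

With that in place, a database hiccup only hurts the occasional cache-miss request instead of every single page view.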

Third-party servers often get ignored, but you have no control over when they have problems.  Serving JavaScript from a third party has the advantage that you don’t have to maintain it, but it puts you at risk of the user’s experience degrading massively.  Consider removing as many third-party elements as possible.  Something like Google Analytics’ JavaScript-based tracking has near-zero impact on page load (aside from transmitting the snippet and downloading the JavaScript library); unlike tracking images, it tends not to cause performance problems, and the site will still load even if the browser is having trouble fetching the JavaScript library at the bottom of your page.

Getting the 50%–200% improvements is great, but a real focus on the few things that can explode out of control will serve you better in the long run.

Mobile Web Like Web in 90s (Usability)

Usability is generally ignored on the web today, not because it isn’t a big deal, but because the “common” design patterns are all reasonably usable.  Users are comfortable with the interface, and nobody really does anything remarkably stupid.  In the late 90s and early 2000s, that wasn’t the case.

Today, the mobile web is all the talk, and apparently we have the same usability problems that we had 10 years ago…  While users have an 80% success rate attempting a task on the web on their computer, that drops to 59% on their phone.

“Observing users suffer during our  … sessions reminded us of the very first usability studies we did with traditional websites in 1994,” writes Jakob Nielsen (free plug: I found this article through his website, Use It).  Indeed, the Web 2.0 “design strategy” of two columns instead of three, the most common operation front and center, and large fonts shows that the Web 2.0 “revolution” largely involved Flash being replaced with sensible JavaScript and designers finally listening to usability guidelines, whether intentionally or accidentally.

The oddest thing about the computer/IT industry is that it doesn’t maintain institutional knowledge or learn from the past.  When basic web forms were decried as a throwback to the 3270 mainframe model, you would think the old mainframe hands would be considered experts, but in an industry where 18-to-25-year-olds can be productive, there is no interest in expertise.  As the mobile web becomes more and more important, usability may make the difference between success and failure.  The idea that I should go to my computer to check a map seems as ludicrous as the idea that I should use the phone book!

WP.me: Bad Idea, But Predictable

Short URLs like Tinyurl.com were created to serve a valuable purpose: as URLs get long (think long query strings, or SEO-friendly text slugs), emailing a link becomes problematic for those using text mail clients, because the URL wraps across lines and breaks.

Twitter’s use of “shortened” URLs for the 140-character limit is totally arbitrary.  If you are sending a tweet via SMS, the protocol supports a URL being passed along as data, not text.  Further, one could always shorten the URLs for SMS purposes and not on the web.  And on the website, you could use anchor text, the words that you click on, instead of the URL itself.

Nonetheless, Twitter decided not to treat URLs as special items, the shrunken URL became a part of Twitter culture, and it is here to stay anywhere posting a link doesn’t show anchor text.

Now, WP.me is a horrid idea.  Creating a special WordPress.com URL isn’t a horrible thing in itself: for those who are Username.wordpress.com, switching to Username.wp.me seems pretty harmless, and offering a shrunken URL format seems fine.  The “permalink” format of /year/month/day/URL-friendly-title works for the pre-2000 Internet days that the search engines still live in, making it SEO-friendly, but it’s less friendly for today’s world of social media and quick URL sharing.

However, that doesn’t appear to be what they are doing.  They appear to be pushing it as a shortening service, so you can still be AlexHochberger.com, but your links will be WP.me/ASDFAD if you choose to use short URLs.  I suppose this serves a purpose for Twitter posts worried about link rot, but it also may trap you on WordPress.com.  If you outgrow their limited blog feature set, how do you make certain that your WP.me links don’t rot out?  WordPress.com seems a bit more stable than Bit.ly, but if Bit.ly survives long term, your links are safe anywhere; WP.me links may only work as long as you stay on that single host.

Given that Bit.ly shares a VC relationship with Twitter, I think they are pretty safe: if they can’t figure out a business model, the VCs can usually force a merger of their two investments.

Media Submarkets on the Web

I love reading what marketing-focused online marketers have to say, because coming from a technology background, I like understanding what my colleagues without a background in tech think are the market-moving forces.  I’ve been quoting a bunch of articles from Media Post, because the daily emails often prompt a good opportunity to think.  Mr. Allen inadvertently suggests that the social media of today traces its roots to the early days of the web, and while he is correct that the desire for interactivity was visible in the early web, it’s the underlying technology that had to change to support it.

Though today’s websites share no common code with the BBS world of the 1970s-1990s, they share a cultural desire to share information, files, and resources online.  Some of the early Unix BBS systems were designed to support information sharing like a dial-up BBS between local users at a university, albeit over a TCP/IP network and Telnet instead of a modem and a terminal emulation/dialer program.

However, the “Social Media” world of today required a certain technological shift.  The “Web 2.0” technology shift and the AJAX acronym didn’t require new technology, but they did require a changing software landscape.  In 2001, when I started in this business, trackable links that didn’t break search engines required custom coding and careful management of the HTTP protocol responses.  In 2009, you go to bit.ly and it does it for you.  In 2001, building a website required building an article repository to manage content; in 2009, many CMS systems are available off the shelf.  In 2001, SEO was emerging from the hacker days of Altavista and riding the PageRank mathematics of Google’s rise and Yahoo’s use of Google PageRank for sorting.
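
As a rough idea of what that 2001-era custom coding looked like (the go.php script, the links table, and the clicks log below are all invented for illustration), the trick was to record the click and then answer with a permanent 301 redirect so search engines credit the real destination:

    <?php
    // go.php?id=123 — sketch of a trackable link that stays search-engine friendly.
    $db = new PDO('mysql:host=localhost;dbname=site', 'user', 'pass');

    $id = isset($_GET['id']) ? (int) $_GET['id'] : 0;
    $stmt = $db->prepare('SELECT target_url FROM links WHERE id = ?');
    $stmt->execute(array($id));
    $target = $stmt->fetchColumn();

    if ($target === false) {
        header('HTTP/1.1 404 Not Found');
        exit;
    }

    // Record the click before sending the visitor on their way.
    $log = $db->prepare('INSERT INTO clicks (link_id, clicked_at, referer) VALUES (?, NOW(), ?)');
    $log->execute(array($id, isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : ''));

    // A 301 tells search engines this is a permanent alias for the real URL.
    header('Location: ' . $target, true, 301);
    exit;

Bit.ly and its cousins are, at heart, doing exactly this on your behalf.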

Why does this matter?  In 2001, building a website required technology skills.  In 2009, WordPress.com has you up and running in 15 minutes, and you can start working on your site.  The early promise of the Web was two-way communication.  Netscape shipped with an HTML editor, because the whole concept of hypertext was easily shareable and editable documents.  The HTTP spec had concepts of file movement that were never adopted until the DAV group realized you could do collaborative editing with them.  HTML editing turned out to be too complicated, but Web 2.0 featured the concept of mini code.  IFrames let websites include content from elsewhere, but you were at the other site’s mercy for how it displayed.  Instead, we now have interactive forms that pull information from anywhere.

The social media of today traces its social roots to the first acoustic modem on a computer, but the technology is new.  When AJAX came out as a popular acronym, it became socially acceptable to put critical content behind a JavaScript layer; previously that was a no-no of web design, where JavaScript for convenience was accepted but could not be required.  The underlying technology was there, but easy libraries brought it to the junior programmers.

Designing a high-end website still requires technology and database skills.  But prototype-grade environments like Ruby on Rails and CakePHP brought RAD concepts from the desktop Visual Basic world to web programming for everyone.  And while they certainly produced many applications that don’t scale, they made these rapid-fire AJAX/JSON mini web services easy to write, and that made the social media world possible.
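
To make “rapid-fire AJAX/JSON mini web services” concrete, here is about the smallest useful one I can sketch in plain PHP, no framework required (the endpoint name, table, and fields are hypothetical):

    <?php
    // search_suggest.php?q=term — the kind of tiny JSON endpoint an AJAX front end calls.
    $q = isset($_GET['q']) ? trim($_GET['q']) : '';

    $db = new PDO('mysql:host=localhost;dbname=site', 'user', 'pass');
    $stmt = $db->prepare('SELECT title, url FROM articles WHERE title LIKE ? LIMIT 10');
    $stmt->execute(array($q . '%'));

    header('Content-Type: application/json');
    echo json_encode(array(
        'query'   => $q,
        'results' => $stmt->fetchAll(PDO::FETCH_ASSOC),
    ));

The client side is one XMLHttpRequest (or a jQuery $.getJSON call) against that URL, which is exactly why junior programmers could suddenly ship interactive features.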

So while marketers may see this evolution of mini-markets, they miss the underlying technology shift.  Once a medium is cheap to create, the advertising on it becomes affordable.  The wire service made real reporting cost-effective, the web made mail order effective, and the underlying language libraries let companies without a technology team build interactive websites, creating these markets.

CMS and CPU Usage

I have normally been averse to Content Management Systems (CMS), because they are generally coded poorly in order to work in a “plugin” format.  Each page routinely makes dozens of database calls, which can put a big strain on the CPU.  On the other hand, they let an individual programmer quickly add LOTS of functionality that would have taken a team of programmers months to develop.  I used to find them particularly heinous because they destroyed SEO attempts, but all the modern systems let you have a reasonable URL structure.  As a result, if you are successful and decide to build the $100,000 website, you can always point the old URLs to the new location and not break links.

However, I got a reminder today that a spike in traffic can destroy your server if you aren’t optimized.  If you are getting promoted on television, being interviewed, or otherwise getting mentioned on a popular program that might send a few thousand people to your site at one time, be careful.  Even if bandwidth isn’t a problem, CPU and memory might be.  If you are expecting a spike, make your home page static.  Most of your visitors will land there, and if you make it a static page (mod_rewrite them to the dynamic script if they have a logged-in cookie), you’ll drastically cut your database load.
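
That mod_rewrite-plus-cookie trick can also be sketched at the top of the PHP front controller itself (the logged_in cookie name, the cache path, and the fresh flag are placeholders, not anything standard):

    <?php
    // Top of index.php: anonymous visitors get a pre-rendered snapshot,
    // logged-in users (and the snapshot refresher sketched further down) fall through
    // to the normal dynamic code.
    $snapshot = __DIR__ . '/cache/home.html';

    if (empty($_COOKIE['logged_in']) && !isset($_GET['fresh']) && is_readable($snapshot)) {
        header('Content-Type: text/html; charset=utf-8');
        readfile($snapshot);   // no database work at all for this request
        exit;
    }

    // ... the normal dynamic home page continues here ...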

In fact, I think most sites would do well to always make the home page static.  Something we did “back in the day” was to program the whole site dynamically, then “mirror” the home page to a file with wget or something similar.  One could have most of a site mirrored to static files and serve up dynamic pages only to logged-in users.  Historically, that’s what Slashdot used to do.
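
And the “mirror it to a file” half of the trick, as a sketch you could run from cron instead of wget (the localhost URL and paths are again placeholders):

    <?php
    // refresh_home.php — run from cron every minute or two.
    // Fetches the dynamic page once and rewrites the snapshot the front controller serves.
    $html = file_get_contents('http://localhost/index.php?fresh=1');

    if ($html !== false) {
        // Write to a temp file and rename so a visitor never sees a half-written page.
        $tmp = __DIR__ . '/cache/home.html.tmp';
        file_put_contents($tmp, $html);
        rename($tmp, __DIR__ . '/cache/home.html');
    }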

Either way, you should remember to optimize your main queries and create the appropriate indexes as part of bringing a site live.  Bandwidth isn’t the only constraint; sites without dozens of servers need to worry about CPU and memory usage as well.  A popular television show can send WAY more simultaneous traffic than social media or search engines, at least at any one moment.