I know of two more tools for analyzing web site traffic - Google Analytics and Alexa. With Google Analytics, the site owner puts a script (provided by Google) in each web page to be analyzed. The script sends Google the visitors' data for analysis (hence "Thinking outside the box"). Alexa takes an "Outside looking in" approach - PC users install the Alexa Toolbar, which collects data on their browsing habits. Alexa then uses this data to estimate the number of users that go to each site.
Google Analytics offers a granular view of site traffic and of how and where it originates (searches, referrals or direct visits). Reports show what visitors are doing when they come to the site. All a site owner needs to do is open a Google account and paste the script in the pages (I included the script in PmWiki's CSS file's header section). Google Analytics offers free, powerful reporting features, which can be accessed online, or emailed at preset times to selected individuals. This service is a win-win for both user and Google. I get an incredible amount of data on my users without spending a cent, and Google gets first hand user data, which they can use for world domination.
Alexa's approach is to track where people who have installed the Alexa Toolbar are surfing. From this data, Alexa can tell how visits to your site rank relative to other sites, or compare the popularity of two sites over a period of time. The validity of the analysis is based on the assumption that their traffic data can be extended to the community at large. This assumption is almost surely wrong - Alexa users are a group whose characteristics are almost certainly different from those of the universe of Internet users. For one, Alexa's site is in English, so English speakers likely weigh heavily in the traffic analysis. Users with privacy concerns, or under repressive regimes, may be less likely to send their traffic to Alexa for analysis, etc. As in every statistical analysis, inference accuracy drops when less data is available, i.e. when a site is not very popular. To Alexa's credit, they openly recognize this fact. Graphical information is only available for sites in the top 100,000. Still, Alexa is a nice tool for a 'quick and dirty' traffic comparison between sites. For example, I find that words2u.net has a traffic rank of 7,737,112, which is up 3,297,142 places from 3 months ago. Alexa connects to the Wayback Machine archives to provide snapshots of web sites at earlier points in time. If you do not believe in reincarnation, a visit to that site might change your mind, especially if you happen on a site that changed ownership. Web site owners can check their ranking by pointing their browser to www.alexa.com/data/details/traffic_details/web.site.name.
Vanity searches are searches one conducts on one's own name. Apparently the practice is common enough to have a Wikipedia entry. I consider vanity searches the virtual equivalent of a credit check - to see what mischief is being conducted in your name, who is appropriating your content for their own use, etc. In the context of web hosting, the vanity search is not personal, but professional - it aims to unearth signs of the site's popularity. One metric is the number of hits on a search using the site's URL. Another possible metric is the position of unique terms (used in the web site) in the search results. Use quotation marks to ensure the search looks for the exact term, rather than for combinations of the individual words.
Another useful search strategy is to track inbound links, that is, links to the site from other pages. To see how many other pages Google finds that link back to your web site, just enter the phrase "link:" before the URL of your main domain. You can find out how many links there are, and also follow the results to visit the pages. You can also go to Google's Advanced Search page and enter your URL into the Page-specific tools, where you can seek links to the given page, or pages similar to it. The latter may give an indication of copycats - after all, copying is the highest form of flattery, remember?
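For anyone who wants to run these searches in bulk, here is a minimal sketch of how the two query styles described above (exact phrase in quotation marks, and the "link:" prefix) could be turned into search URLs. The base address https://www.google.com/search and the helper names are my assumptions for illustration, not something taken from Google's documentation.

```python
from urllib.parse import urlencode

# Assumed base address of the search page; adjust for another search engine.
SEARCH_BASE = "https://www.google.com/search"

def exact_phrase(phrase):
    """Wrap a phrase in quotation marks so it is searched as an exact term."""
    return '"' + phrase + '"'

def inbound_links(domain):
    """Build the 'link:' query described in the post for a given domain."""
    return "link:" + domain

def search_url(query):
    """Return a search URL with the query safely percent-encoded."""
    return SEARCH_BASE + "?" + urlencode({"q": query})

# Print one URL per sub-domain, plus one exact-phrase search.
for domain in ("gps.words2u.net", "www.words2u.net", "blog.words2u.net"):
    print(search_url(inbound_links(domain)))
print(search_url(exact_phrase("Costa Rica GPS waypoints")))
```

The encoding step matters because colons, quotation marks and spaces are not safe to paste into a URL as they are.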
So bearing all these tactics in mind, I ran searches for links: "gps.words2u.net", "www.words2u.net" and "blog.words2u.net", and also searched without the "link:" term and without the quotes. I also tested the Advanced Search page-specific tools, which gave substantially fewer results. The good news - Google actually found some links. The bad news - they were all links I had generated when I posted questions, or sent emails which included my web site. The silver lining - after conducting these searches, there is no vanity left in me. Some other searches: a search for the term "Costa Rican GPS waypoints, tracks, and useful location information" (entered without the quotes) does give my site as the top result. A search for "Costa Rican GPS waypoints, tracks, location information" does the same; "Costa Rica GPS waypoints" puts me in second place; "Costa Rica GPS tracks" has me at number 4. Now if only more people looked for Costa Rican waypoints or GPS tracks, my web site could be a hit!
On a similar trail, I looked for my blog on technorati.com. Again, the good news is they recognize my blog (blog.words2u.net). Alas, a search for "Costa Rica GPS Waypoints" returns some links to GPS devices, accessories and tricks, and even an article on 'How to measure for a proper fitting bra', but not my blog. This is probably because my blog does not yet have many references to these terms. By the way, anyone who needs a GPS to fit clothing is in serious trouble, but I digress.
And now to something completely different: I now redirect from www.words2u.net to gps.words2u.net, and have renamed my index.php to index.php.replaced04092008, in case you are following the links from the previous blog posts on home page design.
Like most humans, we, web site creators, need to be loved. Of course, we are not talking just about romantic love. We want to know whether anyone cares about what we (or our web sites) have to say, offer or sell. Unless we are willing to wait for an award to find out how much (and how many) people REALLY like us (think Sally Field at the 1985 Academy Awards), we need a faster way to find out.
Here are four easy approaches. Try them and find the one that works best for you.
1. Introspection
2. Vanity
3. Thinking outside the box
4. Outside, looking in
1. Introspection:
Socrates noted that the unexamined life is not worth living. Introspection and the search for truth are existential virtues. In our context, we seek to learn how much people like us by quantifying web site traffic. We can measure visits to our web site from inside the server, if we own it.
First, let's look at web server logs. Apache has two sets, access logs and error logs. Both require root privileges to view. As root, it is possible to see information about each access attempt. Here is a line from the access log, showing the originating IP, date, time, requested page, status code, response size, referring page, browser, and OS.
195.2.114.20 - - [29/Mar/2008:14:20:37 -0600] "GET / HTTP/1.0" 200 937 "http://forum.words2u.net/" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7"
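To make a line like this easier to digest, here is a minimal sketch that splits it into named fields. It assumes the log is in Apache's "combined" format, which the sample line appears to follow; the function name and the field names are my own choices for illustration.

```python
import re

# One line of Apache's "combined" format: IP, identity, user, [timestamp],
# "request", status, size, "referrer", "user agent".
COMBINED = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_access_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # Apache writes "-" when no bytes were sent.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields

# The sample line from the post, split over several string literals.
sample = ('195.2.114.20 - - [29/Mar/2008:14:20:37 -0600] "GET / HTTP/1.0" 200 937 '
          '"http://forum.words2u.net/" "Mozilla/5.0 (Windows; U; Windows NT 5.1; '
          'en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7"')
entry = parse_access_line(sample)
print(entry["ip"], entry["time"], entry["request"], entry["status"], entry["referrer"])
```

Lines that do not fit the pattern come back as None, so a script looping over a real log can simply skip them.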
Here is an example from the error log - I get quite a few requests for favicon.ico. I just created it, so there should be fewer of these errors from now on.
[Mon Mar 31 14:30:24 2008] [error] [client 81.207.89.42] File does not exist: /home/words2u/favicon.ico
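A quick way to see which missing files cause the most noise is to tally them. The sketch below assumes every error line has the same shape as the one above (timestamp, level, client, message); the function name is hypothetical, and other message types are simply ignored.

```python
import re
from collections import Counter

# Shape of the error line shown in the post:
# [timestamp] [level] [client IP] message
ERROR_LINE = re.compile(
    r'\[(?P<time>[^\]]+)\] \[(?P<level>\w+)\] '
    r'\[client (?P<client>[^\]]+)\] (?P<message>.*)'
)
MISSING_PREFIX = "File does not exist: "

def count_missing_files(lines):
    """Count 'File does not exist' errors per requested path."""
    counts = Counter()
    for line in lines:
        match = ERROR_LINE.match(line)
        if match is None:
            continue  # not in the expected shape, skip it
        message = match.group("message")
        if message.startswith(MISSING_PREFIX):
            counts[message[len(MISSING_PREFIX):].strip()] += 1
    return counts

log_lines = [
    "[Mon Mar 31 14:30:24 2008] [error] [client 81.207.89.42] "
    "File does not exist: /home/words2u/favicon.ico",
]
for path, hits in count_missing_files(log_lines).most_common():
    print(hits, path)
```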
This is quite a mouthful, hard to read, analyze or visualize, so I moved to the next option - traffic monitoring software, or log analyzers. I installed webalizer, which gives wonderfully colorful reports on the web. Unfortunately, the data I see is pretty meaningless, or at least it seems so to me - it keeps telling me I had 2 visitors, one from Poland, the other from Russia. I guess I should read the manual. I also installed awstats (apt-get install awstats, not hard to do), but could not even tell where to look for the analysis. So, at least for now, this is not really working for me.
There is a slew of log analysis tools, both free and commercial, which extract information from the log, draw nice graphs, and provide auxiliary information, such as the geographic location of the visitor, and so on. The main disadvantage is that someone has to install these programs, then figure out how to use them.
Here is a short list of log analyzers for Linux, thanks to Danny from FLUX:
analog http://www.analog.cx
awstats http://awstats.sourceforge.net/
webalizer http://webalizer.org/
webtrax http://www.multicians.org/thvv/webtrax-help.html
http-analyze http://http-analyze.org/
awffull http://www.stedee.id.au/awffull
sawmill http://www.sawmill.net/ (commercial)
summary http://www.summary.net/ (commercial)
visitors http://www.hping.org/visitors/
webtrends http://www.webtrends.com/ (commercial)
hitbox http://www.hitbox.com/ - now forwarding to another commercial site
If you want to check out progress on my web site, go to my Costa Rica GPS wiki (gps.words2u.net) or blog (blog.words2u.net) - and leave a message on this site, telling me what I am doing right or wrong. Next time - vanity.
It is now almost two weeks since my last post. While quite a few people visited the blog and my site, no one made any suggestion regarding which design is better. Truth be told, I am not happy with any of the proposed designs, but the closest one for me was the single table with everything in one page. So I am going to stick with a modification of it from now on. If you are following the links from my last post, please note that the original index page can now be found at http://www.words2u.net/index.original.php - the rest of the links are unchanged for now. So if you want to visit my new home page, check out this link. By the way, I am finding that 'twitter' style blogs are a lot easier to do than standard, long entries. I can't promise to stick to the 140 character length, but I will try as hard as I can. See you soon!
The maxim "You never get a second chance to make a first impression" (attributed to W. Triesthof), has been applied to the art of selling, to good posture, to well being, and, of course, to web design. Still, on the web, this adage is flat wrong. When trying to impress millions of individuals, two eyeballs at a time, there are millions of chances to make a first impression.
Which is why, several months into this blog, I am fixing my home page. It may be too late for the readers of this blog, but I probably still have a chance with the rest of humanity.
My original design had two pages - Home page, and About page. The Home page listed links to the main sections of the site. The About page added a little technical, personal and contact information, and a pitch for contributions.
This design has two problems. First, the home page is scrawny, making the page look empty and uninviting. Second, the contributions pitch, on the About page, is likely to be much less viewed, compared to the Home page. Considering the relative potential importance of contributions to the future development of the site, the pitch should be more visible.
So I tried a second design, combining the two original pages. This time, I only needed one column in the table, the page size is more appealing (IMHO), and there is more room for the contribution pitch (so I can explain the purpose, instead of just asking). I also moved that part higher up in the page. I like this design much better.
However, I ran into yet another web site, and adapted it for my needs as well. This one uses two columns in a table to direct the viewer to accomplish various desired actions. It's a bit sparse, but gets all the information across in an organized fashion. This last design looked even better! So, I decided to try and duplicate the last design with the markup tools of PmWiki, with a little bit of color thrown in as well. And here it is. When I figure out how to set the font family and font size, I will improve the page even more. Finally, I copied the page into an existing group, with its headers and footers, here - which resulted in a repetitive visual, since the footer and side bar already include the 'Donate' button.
Sadly, I have to say that the PmWiki page looks a lot better than any of my 'adopt, extend and modify' ones. It may be an indication that professional tools are superior to hand coding, or a testament to my web design prowess. Either way, I will probably stick with it in one way or another. The only problem is that PmWiki's PHP processing takes a few milliseconds more, making the site a bit more sluggish. You are invited to check out the different designs, opine on their relative merits or lack thereof, or offer better ones. Bear in mind, though, that the focus should be on the design, not necessarily the text. Here are the links again:
Home and About
Combined Table view
Table by PmWiki
PmWiki Table with header and footer
Today I want to go on about web design in general, and also talk about my web site's home page.
The General Rant
It is said that the 'well rounded scientist' concept is dead, replaced by two groups - people who know everything about nothing (specialists), and people who know nothing about everything (generalists). Technically, there is no difference between the two (something times nothing is nothing), but from experience, the specialists get paid much, much better - at least when they have a job. It seems to me that home pages tend to follow the same trend. They either do one thing well, or they try to do everything at once - one stop shopping for all information and entertainment needs. My personal preference is utilitarianism. I want to get in and out quickly, preferably with what I was looking for. Unless, of course, what I want is a leisurely, meandering tour of content (such as reading the newspapers, or looking at everything for sale on a web site).
For example, let's look at some leading web sites. We all hear about the web search wars between Google, Yahoo! and Microsoft (MSN). When you are at the Google web site, there is a text box, a couple of choices, and a search button. When you are at MSN or Yahoo!, you may think you are reading the supermarket special offers handout - there are boxes with text, pictures, weather info, stock market updates, regular links, bold links. Oh, and somewhere in there is also a search box. Does anyone wonder why Google is winning the search war? Ask.com is similar to Google, and I know other sites emulate the same sparse design as well, and I am sure they are much more attractive to searchers than the busy web pages of portals that also do search.
Let's take a look at some social networking sites. We have Bebo and MySpace, which look like portals, with text ads, pictures, links, videos, etc., and a rather busy interface. In contrast, Facebook, LinkedIn and Orkut have a functional interface, essentially a box with login/password fields and some additional information. The real sprawling mess is hidden behind that one simple door, which suits me fine. If I want to search for video clips, I would go to YouTube, Revver, or one of their equivalents. When I visit a social network site, I want to get in, do my thing, and get out. Nothing more, nothing less. Which brings me to my web site and my design preferences.
My Web Site (www.words2u.net)
According to a comment, my site '...contains almost nothing at all, just a little text... Your blog link is a big empty space.. "Technical" is not a live link...', which does show how much one can observe by just looking. I have already apologized about the blog (WordPress died prematurely after I applied a package update), but let's take a minute to review the rest of the comment. I subscribe to the school of thought that 'if something is worth doing, it is worth doing badly' - in other words, it is better to get something started and improve it later, than to wait till it is perfect, which is probably never. So when I decided to go on with the web page, I borrowed a table from another site, slapped on my basic content, added a PHP line for the dynamic date, and took it live. I wanted a home page that lists and links to the three other site components, and a second page with a general description. I am not sure if the design I selected is not in itself over-designed. I will see about the front end in the coming week (sorry, other commitments), and in the meantime, if you have suggestions on how to further simplify the page, please let me know. If I have enough time and suggestions, I will create several pages based on your comments, and let you choose the best one. Then I will use another one, of course, just to show character.
Today I want to build on my last entry. The enlightening comments raised interesting issues regarding hosting a web server, which I would like to address before continuing with my hosting saga. Besides, recycling is good for the soul. So let's get right to it:
The Server
My server (this information is listed in the GPS.TechnicalInformation page on my web site wiki) uses a Celeron 700 MHz processor with 384 MB RAM (told ya' I like recycling), connected to a shared 512 kbps line (64 kBps, of which about 25% is overhead and losses). While this would be an acceptable pipeline were it always free, performance can be downright sluggish when any bandwidth intensive activity takes place. In other words, if you have a real site, use a data center with decent pipes and decent equipment, not 1990s technology and an analog-grade pipeline, like me. In my defense, this is not a real business. If the server makes money, I will move it to an ISP. If it does not, and you keep complaining, I will ask you to send me money. Promise.
The Operating System
I use Ubuntu Linux on my server (this is listed in the Technical Information page, http://www.words2u.net/pmwiki/?n=GPS.TechnicalInformation). Why Linux? It is free and thus can provide infinite ROI with a penny of profit. I am familiar with it and don't mind learning more about it. My server runs pretty sluggishly with Windows 2000 (I used it for my MCSE classes), and trying to fit Windows Server AND SQL on it is asking for trouble. Windows NT and Windows 2000 are no longer supported by Microsoft, and while I hear BSD and Solaris are solid, both present me with a learning curve, which, with my below average intelligence, and above average age, is a major deterrent. Why Ubuntu? Because Ubuntu has a slick desktop aimed at the uninitiated, an outstanding package management system, and a very large, active and friendly user community. And it makes sense to use the same brand on the server, instead of learning two systems of doing things. RedHat and Novell (Suse) have outstanding products, but these are commercial products aimed at paying corporate clients, which I am not. RedHat does not have an official desktop product at all. Were I a business, I would consider a commercial product, supported by an established vendor. Windows, Solaris, RedHat, and Suse all offer service packages, which provide good value. With Ubuntu, you can also get paid support. I did not, and suffered the consequences. During an upgrade (a one line command - sudo apt-get upgrade) my blog software (WordPress) stopped working. Until I remove and reinstall it, my choices are a white screen or an error message. I switched to the latter after reading your comments.
The Server Software
As you, the readers, suggested, it is possible to learn a lot about the system with a few simple tools. A port scan reveals 3 open ports: 21, 22 and 80. Telnet to the ports shows that I use Apache 2, PHP, vsFTPd and OpenSSH.
The Plan
My plan for the web site is to have a simple home page, which leads to three other components - a wiki, a blog, and a content distribution system. For the wiki and the blog, I wanted to use off the (virtual) shelf products. The content distribution system is still in the planning stages. The wiki I use is PmWiki, the blog is (was) WordPress.
The Home Page
I considered using an HTML editor to create the home page. I remember using HomeSite, FrontPage, Netscape Editor and BlueFish in the distant past. But for 3-4 pages it was not worth the trouble. I decided to follow the lead of the software giants - borrow, modify, and extend. Yes, I used a text editor to create some of the pages (you can identify them by the total lack of standard tags, like head, body, HTML, etc.). Luckily for me, most browsers hide my crude HTML, and render the pages properly. Is this good enough for a professional site? No. Malformed web pages are inexcusable, and inexpensive professional tools give the ability to track pages and users, and to provide a unique experience based on geographic location or user demographics. Am I proud of my web site? Like a parent, I am proud of my child even if others find it a bit slow, a bit ugly, or suffering from attention deficit disorder. Remember, the goal was to use this as a learning experience, and this is only the first step in a long march. And I already had a chance to learn from your comments.
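For the curious, the line speed figures quoted in the server description above can be worked through in a few lines. This is only a back-of-the-envelope sketch: it assumes the shared line is completely free, treats 1 kB as 1000 bytes, and uses a made-up 1 MB download as the second example.

```python
# Figures quoted in the post.
LINE_KBPS = 512   # line speed, kilobits per second
OVERHEAD = 0.25   # fraction lost to overhead and losses

def usable_kilobytes_per_second(line_kbps, overhead):
    """Convert kilobits to kilobytes (divide by 8), then remove the overhead."""
    return line_kbps / 8 * (1 - overhead)

def transfer_seconds(size_bytes, line_kbps=LINE_KBPS, overhead=OVERHEAD):
    """Best-case time to send size_bytes, assuming 1 kB = 1000 bytes."""
    return size_bytes / (usable_kilobytes_per_second(line_kbps, overhead) * 1000)

print("raw kB/s:   ", LINE_KBPS / 8)
print("usable kB/s:", usable_kilobytes_per_second(LINE_KBPS, OVERHEAD))
# 937 bytes is the response size seen in the access log sample.
print("937 byte page: %.3f s" % transfer_seconds(937))
# Hypothetical 1 MB download, for comparison.
print("1 MB file:     %.1f s" % transfer_seconds(1_000_000))
```

Small pages go out almost instantly, but anything around a megabyte ties up the whole line for a good twenty seconds, which is why one large download is enough to make everything else feel sluggish.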
The operating system was the first thing I had to choose for my server. The choice was between Free/Open Source software (FOSS) and a proprietary system (Windows or Unix). Unix systems include Linux, BSD, Solaris, and several proprietary Unices. Windows variants include servers based on NT, XP, and so on. If you are thinking of deploying a server, this is your first decision, too. My view is that one should choose the system one knows - or would like to know - because making the most of the server is more important than the OS used, and skills and knowledge affect security and ease of use much more than the differences between the operating systems. And that's from someone who is both an MCP and a long time user of Windows, Linux and MacOS. Now, if you are a veteran of the operating system holy wars, you are probably at the boiling point by now, so let me elaborate. Yes, Windows and Unix are not the same. Under the hood they are very different beasts. But as a user, I am more interested in what affects me, not in the way the software handles threads on a multicore processor. To me, Unix/Linux is a command line system with graphical user interfaces added on top, while Windows is a graphical system with command line utilities attached to it. Six of one, half a dozen of the other. Traceroute versus Tracert. Proprietary systems cost money, but for that you get documentation and some hand holding and tech support. With open source you have community documentation and advice, so you are dependent on the kindness of strangers, some of whom have pretty rough edges. Bugs in FOSS are easier and faster to repair, but you need to know how to code, or be lucky enough to find a responsive developer involved with the software in question (yes, it CAN happen). With proprietary systems you are dependent on the vendors, and your bug may be number 328 on the list.
But the most important aspect of running a server is performance and security, and these depend more on the server administrator than on the OS. Securing and optimizing servers require learning the system in depth, applying updates and patches on a regular basis, modifying default configurations, removing and adding components, etc. It takes attention to details, constant vigilance and endless tweaking. You'd better like doing it in your OS, because you'll be doing it - or worrying about it - every day and every night. So by now you probably wonder which system I chose. Here is an assignment - find it out... The web site is www.words2u.net - use one of the tools that report the underlying OS, or read through the few pages that are already finished - the Tech details are there somewhere.
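If you would rather script the assignment than hunt for a tool, here is one possible sketch. It assumes the web server is willing to announce itself in the "Server" response header, which many servers do by default but which an administrator can switch off; the hint table and both function names are my own inventions, and the sample header strings are made up for illustration, not captured from any real site.

```python
import http.client

# Hypothetical table of substrings that sometimes appear in a Server header.
OS_HINTS = {
    "ubuntu": "Ubuntu Linux",
    "debian": "Debian Linux",
    "centos": "CentOS Linux",
    "red hat": "Red Hat Linux",
    "unix": "Unix",
    "win32": "Windows",
    "microsoft-iis": "Windows",
}

def fetch_server_header(host, port=80, timeout=5):
    """Send a HEAD request and return the Server header ('' if absent)."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", "/")
        return conn.getresponse().getheader("Server", "")
    finally:
        conn.close()

def guess_os(server_header):
    """Guess the OS from a Server header string; 'unknown' if no hint matches."""
    lowered = server_header.lower()
    for token, name in OS_HINTS.items():
        if token in lowered:
            return name
    return "unknown"

# Made-up header strings, so the example runs without touching the network.
for header in ("Apache/2.2.8 (Ubuntu)", "Microsoft-IIS/6.0", "nginx"):
    print(header, "->", guess_os(header))
```

To try it against a live site, call guess_os(fetch_server_header("www.example.com")) with the host you are curious about. A header that gives nothing away proves nothing either way, so treat the result as a hint, not a verdict.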
In this blog I will review the lessons, dilemmas and occasional miseries associated with hosting a server. I will describe in detail my hosting journey, from conception, through planning and implementation, to (I hope) triumph.
The importance of the web as a tool of self expression is quite obvious. Blogs, social networking and collaboration sites are hugely popular. These sites provide tools to facilitate creativity, and make it easy to contribute without knowledge of programming or design. Yet, the great majority of these sites are run by companies and media organizations, and users are tenants, not landlords. To own and control web pages, one has to use a web hosting provider or place one's own server in a co-location facility. Let's consider the available hosting solutions. Most residential Internet providers, be they dial-up, cable or DSL, provide space for personal web pages at no extra cost. Shared hosting costs a few dollars monthly. Dedicated servers are available for lease at well under a hundred dollars a month. For this amount, one gets (or at least should get) quality hardware, fast and reliable backbone connectivity, server administration tools, site monitoring, technical support, and additional paid services. All these hosting options keep the noise, the heat, and the reboots far away from the user. So, you must be asking, with so many cheap and easy alternatives around, why do I want to host my own server, rather than lease one? For quite a few reasons, actually. Because hosting my own server is a challenge and a learning experience, because it lets me have things exactly the way I like them, and because it gives me the freedom to mess things up, repeatedly, without serious consequences.
By hosting my own server, I can use any hardware I choose, run any OS I like, upgrade, downgrade, or replace hardware, operating systems and applications as I see fit, as often as I desire, without driving my ISP's technical staff up the wall. I can use any program and every available port, without the risk of violating my service agreement. No one (except the guards who admit me to the co-location facility) will ever know how many times I had to re-install the system. And with all the fidgeting, I will become a better server administrator, learn about the business and technology of hosting, and, last but not least, be able to write about it and learn from the experience of people with a similar interest. If, like me, you enjoy tinkering, trying out different software packages until you find the one with the right combination of features, stability and usability, then hosting your own server is the best way. Provided, of course, that you can find a co-location facility that is not too far away and lets you in 24x7. I hope you keep up with my progress on this blog, and help me along with comments, suggestions and advice. My web site will be up soon - the URL is www.words2u.net. Feel free to visit the site, and tell me how I am doing. I would love to hear from you.