1. Find the total number of unique visitors:
cat access.log | awk '{print $1}' | sort | uniq -c | wc -l
2. Find the number of unique visitors today:
cat access.log | grep "$(date '+%d/%b/%Y')" | awk '{print $1}' | sort | uniq -c | wc -l
3. Find the number of unique visitors this month:
cat access.log | grep "$(date '+%b/%Y')" | awk '{print $1}' | sort | uniq -c | wc -l
4. Find the number of unique visitors on an arbitrary date, for example March 22nd, 2007:
cat access.log | grep 22/Mar/2007 | awk '{print $1}' | sort | uniq -c | wc -l
5. (based on #3) Find the number of unique visitors for the month of March:
cat access.log | grep Mar/2007 | awk '{print $1}' | sort | uniq -c | wc -l
6. Show the number of requests per visitor IP address, sorted by request count:
cat access.log | awk '{print "requests from " $1}' | sort | uniq -c | sort -n
7. Similarly, adding "grep date" as in the tips above produces the same statistics for that date only:
cat access.log | grep 26/Mar/2007 | awk '{print "requests from " $1}' | sort | uniq -c | sort -n
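One caveat with the "grep date" approach: the pattern is matched against the whole line, so a URL or referrer that happens to contain 22/Mar/2007 would be counted as well. A minimal sketch that matches only the timestamp field, assuming the common/combined log format where the client IP is field 1 and the timestamp is field 4 (unique_visitors is a hypothetical helper name, not a standard tool):

```shell
# Hypothetical helper: count unique client IPs in an access log.
# Assumes common/combined log format: field 1 is the client IP and
# field 4 is the timestamp, e.g. [22/Mar/2007:10:15:00
# Usage: unique_visitors LOGFILE [DATE_FRAGMENT]
unique_visitors() {
    # With no date fragment, every line counts; otherwise only lines
    # whose timestamp field contains the fragment.
    awk -v d="${2:-}" 'd == "" || index($4, d) > 0 { print $1 }' "$1" |
        sort -u | wc -l | tr -d ' '
}
```

It could be called as unique_visitors access.log for the total, unique_visitors access.log 22/Mar/2007 for one day, or unique_visitors access.log Mar/2007 for a month.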
1 - Most Common 404s (Page Not Found)
cut -d'"' -f2,3 /var/log/apache/access.log | awk '$4==404{print $4" "$2}' | sort | uniq -c | sort -rg
2 - Count requests by HTTP code
cut -d'"' -f3 /var/log/apache/access.log | cut -d' ' -f2 | sort | uniq -c | sort -rg
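The cut -d'"' commands in this section rely on how a combined-format log line splits on double quotes. As a small sketch, the following runs the same cuts on a single made-up sample line (the IP, URL, referrer, and user agent are invented for illustration) to show which field holds what:

```shell
# Made-up sample line in Apache combined log format (illustrative only).
line='192.0.2.7 - - [22/Mar/2007:10:15:00 +0000] "GET /img/logo.png HTTP/1.1" 200 5120 "http://example.com/" "Mozilla/5.0"'

# Splitting on the double quote gives:
#   field 2 = request line, field 3 = status and size,
#   field 4 = referrer,     field 6 = user agent
request=$(printf '%s\n' "$line" | cut -d'"' -f2)
# Field 3 starts with a space, so the status is its second space-separated part.
status=$(printf '%s\n' "$line" | cut -d'"' -f3 | cut -d' ' -f2)
referrer=$(printf '%s\n' "$line" | cut -d'"' -f4)
agent=$(printf '%s\n' "$line" | cut -d'"' -f6)

printf 'request=%s\nstatus=%s\nreferrer=%s\nagent=%s\n' \
    "$request" "$status" "$referrer" "$agent"
```

This assumes the combined format; with the plain common format there is no referrer or user agent, so fields 4 and 6 would be empty.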
3 - Largest Images
cut -d'"' -f2,3 /var/log/apache/access.log | grep -E '\.jpg|\.png|\.gif' | awk '{print $5" "$2}' | sort | uniq | sort -rg
4 - Filter Your IP's Requests
tail -f /var/log/apache/access.log | grep '<your IP address>'
5 - Top Referring URLs
cut -d'"' -f4 /var/log/apache/access.log | grep -v '^-$' | grep -v '^http://www.yoursite.com' | sort | uniq -c | sort -rg
6 - Watch Crawlers Live
For this we need an extra file, which we'll call bots.txt. Here are its contents:
Bot
Crawl
ai_archiver
libwww-perl
spider
Mediapartners-Google
slurp
wget
httrack
This just helps us pick out the common user agents used by crawlers.
Here's the command:
tail -f /var/log/apache/access.log | grep -f bots.txt
7 - Top Crawlers
This command will show you all the spiders that crawled your site with a count of the number of requests.
cut -d'"' -f6 /var/log/apache/access.log | grep -f bots.txt | sort | uniq -c | sort -rg
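Note that grep -f matches case-sensitively by default, so the pattern Bot would miss an agent such as bingbot, and wget would miss Wget/1.12. A hedged variant, assuming case-insensitive matching is what you want, adds -i. The name top_crawlers is hypothetical, and both arguments are placeholders for your own log and pattern file:

```shell
# Hypothetical helper: count requests per crawler user agent.
# Usage: top_crawlers LOGFILE PATTERNFILE
# -i makes the patterns in PATTERNFILE match regardless of case.
# PATTERNFILE must not contain blank lines: an empty pattern matches every line.
top_crawlers() {
    cut -d'"' -f6 "$1" | grep -i -f "$2" | sort | uniq -c | sort -rn
}
```

For example: top_crawlers /var/log/apache/access.log bots.txt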
How To Get A Top Ten
You can easily turn the commands above that aggregate (the ones using uniq) into a top ten by adding this to the end:
| head
That is, pipe the output to the head command, which prints only the first ten lines by default.
Simple as that.
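As a sketch, the aggregate-then-truncate pattern can also be wrapped in a small function. The name top_n is hypothetical; it reads one value per line on standard input and prints the N most frequent values with their counts (10 if no argument is given):

```shell
# Hypothetical helper: print the N most frequent input lines with counts.
# Usage: some_command | top_n [N]
top_n() {
    sort | uniq -c | sort -rn | head -n "${1:-10}"
}
```

For example, a top five of referrers: cut -d'"' -f4 /var/log/apache/access.log | top_n 5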
Zipped Log Files
If you want to run the above commands on a rotated, gzip-compressed log file, start with zcat on the file and pipe its output into the first command (the one that takes the filename), dropping the filename from that command.
So this:
cut -d'"' -f3 /var/log/apache/access.log | cut -d' ' -f2 | sort | uniq -c | sort -rg
Would become this:
zcat /var/log/apache/access.log.1.gz | cut -d'"' -f3 | cut -d' ' -f2 | sort | uniq -c | sort -rg
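To run one pipeline over both current and rotated logs, a small wrapper can decide per file whether to decompress. This is a minimal sketch that assumes compressed logs end in .gz; read_log is a hypothetical helper name, and gzip -dc is used as the equivalent of zcat:

```shell
# Hypothetical helper: print a log file, decompressing it if it ends in .gz.
# Usage: read_log FILE
read_log() {
    case $1 in
        *.gz) gzip -dc -- "$1" ;;   # compressed: decompress to stdout
        *)    cat -- "$1" ;;        # plain text: print as-is
    esac
}
```

For example, to count requests by HTTP code across both files:
for f in /var/log/apache/access.log /var/log/apache/access.log.1.gz; do read_log "$f"; done | cut -d'"' -f3 | cut -d' ' -f2 | sort | uniq -c | sort -rg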