Posted: . At: 12:22 PM. This was 12 months ago. Post ID: 17889

Parse useful information from an Apache server log easily using awk.

Awk is a very useful tool on Linux, and it is well suited to pulling data out of an Apache server access log. Below I print every URL requested on a website, then use grep to keep only the URLs that contain a certain keyword.

┗━━━━━━━━━━┓ john@localhost ~/Downloads
           ┗━━━━━━━━━━━━━╾ ╍▷ awk '{print $7}' sslaccesslog_securitronlinux.com_3_27_2023 | grep youtube
/debian-testing/how-to-download-a-youtube-video-with-metadata/
/debian-testing/how-to-download-a-youtube-video-thumbnail-on-linux/
/bejiitaswrath/make-a-playlist-of-youtube-videos-and-play-it-with-mpv/
/bejiitaswrath/make-a-playlist-of-youtube-videos-and-play-it-with-mpv/
/bejiitaswrath/install-a-very-useful-youtube-app-on-linux-and-watch-youtube-without-a-browser/
/debian-testing/how-to-download-a-youtube-video-thumbnail-on-linux/
/debian-testing/how-to-download-a-youtube-video-thumbnail-on-linux/
/bejiitaswrath/very-nice-user-script-youtube-shorts-redirect/
/debian-testing/more-useful-tricks-with-youtube-dl-on-linux/
/debian-testing/how-to-download-a-youtube-video-with-metadata/
/bejiitaswrath/very-nice-user-script-youtube-shorts-redirect/
/bejiitaswrath/very-nice-user-script-youtube-shorts-redirect/
/bejiitaswrath/install-a-very-useful-youtube-app-on-linux-and-watch-youtube-without-a-browser/
/bejiitaswrath/install-a-very-useful-youtube-app-on-linux-and-watch-youtube-without-a-browser/
/bejiitaswrath/install-a-very-useful-youtube-app-on-linux-and-watch-youtube-without-a-browser/
/bejiitaswrath/install-a-very-useful-youtube-app-on-linux-and-watch-youtube-without-a-browser/
/debian-testing/how-to-download-a-youtube-video-with-metadata/
/bejiitaswrath/install-a-very-useful-youtube-app-on-linux-and-watch-youtube-without-a-browser/
/bejiitaswrath/very-nice-user-script-youtube-shorts-redirect/
/debian-testing/youtube-dl-automatic-update-not-working/
/debian-testing/how-to-download-a-youtube-video-with-metadata/
/bejiitaswrath/make-a-playlist-of-youtube-videos-and-play-it-with-mpv/
/bejiitaswrath/make-a-playlist-of-youtube-videos-and-play-it-with-mpv/
/debian-testing/how-to-download-a-youtube-video-with-metadata/
/debian-testing/how-to-download-a-youtube-video-with-metadata/
/debian-testing/more-useful-tricks-with-youtube-dl-on-linux/
/bejiitaswrath/make-a-playlist-of-youtube-videos-and-play-it-with-mpv/
/bejiitaswrath/watch-youtube-videos-with-the-mpv-movie-player-this-is-a-very-simple-process/
/debian-testing/how-to-view-videos-on-youtube-and-not-count-as-a-view/
/bejiitaswrath/install-a-very-useful-youtube-app-on-linux-and-watch-youtube-without-a-browser/
/bejiitaswrath/very-nice-user-script-youtube-shorts-redirect/
/bejiitaswrath/a-nice-youtube-layout-fix-for-the-website-in-2022/
/bejiitaswrath/very-nice-user-script-youtube-shorts-redirect/
/debian-testing/how-to-view-videos-on-youtube-and-not-count-as-a-view/
/debian-testing/more-useful-tricks-with-youtube-dl-on-linux/
/bejiitaswrath/install-a-very-useful-youtube-app-on-linux-and-watch-youtube-without-a-browser/
/bejiitaswrath/how-to-block-related-videos-and-the-recommended-movies-section-on-youtube/
/bejiitaswrath/feed-a-youtube-video-into-ffmpeg-directly-with-youtube-dl/
/debian-testing/how-to-download-a-youtube-video-with-metadata/
/debian-testing/how-to-download-a-youtube-video-with-metadata/feed/
/bejiitaswrath/very-nice-user-script-youtube-shorts-redirect/
/debian-testing/more-useful-tricks-with-youtube-dl-on-linux/

Using awk '{print $7}' prints only the seventh whitespace-separated field of each line, which in this log format is the requested URL path.
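
The same field can be used to count how often each URL was requested. This is only a sketch: it assumes the log uses the Apache combined format, and the sample log lines, IP addresses and the variable names log and top_paths are made up for illustration.

```shell
# Hypothetical sample log in Apache combined format (made-up lines).
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [27/Mar/2023:07:03:01 -0500] "GET /feed/ HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"
192.0.2.11 - - [27/Mar/2023:07:03:05 -0500] "GET /about/ HTTP/1.1" 200 1024 "https://www.google.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
192.0.2.12 - - [27/Mar/2023:07:03:09 -0500] "GET /feed/ HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)"
EOF

# Field 7 is the request path. Sort first so identical paths are adjacent,
# count them with uniq -c, then sort numerically with the largest count first.
top_paths=$(awk '{print $7}' "$log" | sort | uniq -c | sort -rn)
printf '%s\n' "$top_paths"

rm -f "$log"
```

Sorting before uniq -c matters because uniq only merges lines that are next to each other.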

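Use this example to list only the external referrers that lead users to your website. It skips requests that came from the site itself and requests with no referrer. Be aware that $11 !~ "-" is a regular expression match, so it also skips any referrer that contains a hyphen anywhere in the URL.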

┗━━━━━━━━━━┓ john@localhost ~/Downloads
           ┗━━━━━━━━━━━━━╾ ╍▷ awk '$11 !~ "securitronlinux.com" && $11 !~ "-" {print $11}' sslaccesslog_securitronlinux.com_3_27_2023 | head -n 30
"https://yandex.ru/"
"https://www.google.si/"
"https://www.google.com/"
"https://www.google.si/"
"https://www.google.si/"
"https://www.google.com/"
"https://steamcommunity.com/"
"https://steamcommunity.com/"
"https://www.google.co.za/"
"https://www.google.com/"
"https://www.google.com/"
"https://www.google.com/"
"https://www.google.com/"
"https://search.brave.com/"
"https://www.google.com/"
"https://www.google.com/"
"https://www.google.ca/"
"https://cn.bing.com/"
"https://www.google.com/"
"https://www.google.hu/"
"https://www.google.hu/"
"https://duckduckgo.com/"
"https://www.google.com/"
"https://www.google.com/"
"https://www.google.com/"
"https://www.google.com/"
"https://www.google.com/"
"https://www.google.com/"
"https://www.google.com/"
"https://www.google.com/"
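
A possible variation counts each external referrer instead of listing every hit. This is a sketch under assumptions: the sample log lines are made up, example.com stands in for your own domain, and the names log, site and referrers are hypothetical. It compares field 11 to the literal "-" and uses index() for a plain substring test, so referrers that contain a hyphen are kept.

```shell
# Hypothetical sample log in Apache combined format (made-up lines).
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [27/Mar/2023:07:03:01 -0500] "GET /a/ HTTP/1.1" 200 512 "https://www.google.com/" "Mozilla/5.0 (X11; Linux x86_64)"
192.0.2.11 - - [27/Mar/2023:07:03:02 -0500] "GET /b/ HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"
192.0.2.12 - - [27/Mar/2023:07:03:03 -0500] "GET /c/ HTTP/1.1" 200 512 "https://example.com/a/" "Mozilla/5.0 (X11; Linux x86_64)"
192.0.2.13 - - [27/Mar/2023:07:03:04 -0500] "GET /d/ HTTP/1.1" 200 512 "https://my-search.example/" "Mozilla/5.0 (X11; Linux x86_64)"
192.0.2.14 - - [27/Mar/2023:07:03:05 -0500] "GET /e/ HTTP/1.1" 200 512 "https://www.google.com/" "Mozilla/5.0 (X11; Linux x86_64)"
EOF

# Keep lines whose referrer is not the literal "-" (quotes included in the field)
# and does not contain our own domain, then count each distinct referrer.
referrers=$(awk -v site="example.com" \
    '$11 != "\"-\"" && index($11, site) == 0 {print $11}' "$log" |
    sort | uniq -c | sort -rn)
printf '%s\n' "$referrers"

rm -f "$log"
```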

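This example is meant to find visitors that are using Linux and print information about them: part of the user agent, the date and time of the visit, and the URL they visited. As written, though, $13 !~ "Linux" keeps lines whose 13th field does not contain Linux, and that field only holds the first token inside the parentheses of the user agent, such as (Windows or (X11;. The output below therefore includes Windows and Macintosh visitors as well. The $11 !~ "-" test limits the output to requests that arrived with a referrer.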

┗━━━━━━━━━━┓ john@localhost ~/Downloads
           ┗━━━━━━━━━━━━━╾ ╍▷ awk '$13 !~ "Linux" && $11 !~ "-" {print $12, $13, $4, $5, $7}' sslaccesslog_securitronlinux.com_3_27_2023 | head -n 30
"Mozilla/5.0 (Windows [27/Mar/2023:07:03:01 -0500] /wp-content/uploads/2018/12/maps.jpg
"Mozilla/5.0 (Windows [27/Mar/2023:07:05:04 -0500] /wp-content/uploads/2017/04/Sparky-Linux-2017-04-13-16-44-49.png
"Mozilla/5.0 (Windows [27/Mar/2023:07:06:12 -0500] /wp-content/uploads/2016/12/Other-Linux-3.x-kernel-64-bit-2016-12-14-20-15-05.png
"Mozilla/5.0 (Windows [27/Mar/2023:07:06:12 -0500] /wp-content/uploads/2016/12/Other-Linux-3.x-kernel-64-bit-2016-12-14-20-15-05.png
"Mozilla/5.0 (X11; [27/Mar/2023:07:06:35 -0500] /installing-and-playing-the-classic-pc-doom-game-on-linuxubuntu/
"Mozilla/5.0 (Windows [27/Mar/2023:07:07:13 -0500] /battlefield/how-to-get-the-developer-mode-working-in-lost-alpha-dc-1-4007/
"Mozilla/5.0 (Windows [27/Mar/2023:07:07:14 -0500] /battlefield/how-to-get-the-developer-mode-working-in-lost-alpha-dc-1-4007/
"Mozilla/5.0 (Windows [27/Mar/2023:07:14:13 -0500] /wp-content/uploads/2020/06/caja.png
"Mozilla/5.0 (X11; [27/Mar/2023:07:14:20 -0500] /debian-testing/install-mac-osx-fonts-on-linux-easily/
"Mozilla/5.0 (X11; [27/Mar/2023:07:15:15 -0500] /bejiitaswrath/a-few-ways-to-list-disk-information-in-linux-mint/
"Mozilla/5.0 (Windows [27/Mar/2023:07:15:21 -0500] /it/connecting-to-linux-over-rdp-from-a-mac-is-very-easy-and-fun/
"Mozilla/5.0 (Windows [27/Mar/2023:07:15:30 -0500] /product/how-to-enable-developer-mode-in-chatgpt-and-get-better-prompt-results/
"Mozilla/5.0 (Windows [27/Mar/2023:07:16:32 -0500] /debian-testing/how-to-list-only-symbolic-links-in-a-directory-on-a-linux-filesystem/
"Mozilla/5.0 (Windows [27/Mar/2023:07:16:43 -0500] /wp-content/uploads/2016/08/unixware.jpg
"Mozilla/5.0 (X11; [27/Mar/2023:07:18:49 -0500] /bejiitaswrath/some-useful-tips-for-the-linux-mate-desktop/
"Mozilla/5.0 (Windows [27/Mar/2023:07:19:07 -0500] /wp-includes/js/jquery/jquery.min.js?ver=3.6.1
"Mozilla/5.0 (Windows [27/Mar/2023:07:19:07 -0500] /wp-content/plugins/contact-form-7/includes/js/index.js?ver=5.7.5.1
"Mozilla/5.0 (Windows [27/Mar/2023:07:19:07 -0500] /wp-includes/js/wp-emoji-release.min.js?ver=6.1.1
"Mozilla/5.0 (Windows [27/Mar/2023:07:19:07 -0500] /wp-content/uploads/astra-addon/astra-addon-63fe8fb776f051-19962211.js?ver=4.0.1
"Mozilla/5.0 (Windows [27/Mar/2023:07:19:07 -0500] /wp-content/plugins/contact-form-7/includes/swv/js/index.js?ver=5.7.5.1
"Mozilla/5.0 (Windows [27/Mar/2023:07:19:24 -0500] /debian-testing/overlay-two-videos-on-top-of-another-with-ffmpeg/
"Mozilla/5.0 (Macintosh; [27/Mar/2023:07:20:24 -0500] /product/how-to-enable-developer-mode-in-chatgpt-and-get-better-prompt-results/
"Mozilla/5.0 (compatible; [27/Mar/2023:07:21:13 -0500] /feed/
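
To select only Linux visitors, one option is to test the whole user agent string rather than a single whitespace-separated field. The sketch below assumes the combined log format and splits each line on double quotes, which makes the user agent field 6. The sample log lines and the names log, linux_hits, pre and req are made up for illustration. Android user agents also contain the word Linux, so they are excluded here.

```shell
# Hypothetical sample log in Apache combined format (made-up lines).
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.10 - - [27/Mar/2023:07:06:35 -0500] "GET /linux-post/ HTTP/1.1" 200 512 "https://www.google.com/" "Mozilla/5.0 (X11; Linux x86_64)"
192.0.2.11 - - [27/Mar/2023:07:07:13 -0500] "GET /other-post/ HTTP/1.1" 200 512 "https://www.google.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
192.0.2.12 - - [27/Mar/2023:07:08:20 -0500] "GET /mobile-post/ HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Linux; Android 13; Pixel 7)"
EOF

# With a double quote as the field separator:
#   $1 = IP, identity, user and timestamp   $2 = request line   $6 = user agent
linux_hits=$(awk -F'"' '$6 ~ /Linux/ && $6 !~ /Android/ {
    split($1, pre, " ")   # pre[4] and pre[5] hold the timestamp
    split($2, req, " ")   # req[2] is the request path
    print pre[4], pre[5], req[2], $6
}' "$log")
printf '%s\n' "$linux_hits"

rm -f "$log"
```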

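List the IP address and requested URL for every request that ended in a 403 error, with a count of how often each pair occurs. The output is sorted before uniq -c because uniq only merges adjacent duplicate lines, and sort -rn puts the highest counts first.

┗━━━━━━━━━━┓ john@localhost ~/Downloads
           ┗━━━━━━━━━━━━━╾ ╍▷ awk '$9 == 403 {print $1, $7}' sslaccesslog_securitronlinux.com_3_27_2023 | sort | uniq -c | sort -rn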

Another way to list all 403 errors. This lists the remote user, the IP address, the requested URL, the request method, the protocol, the status code, the response size and the referrer URL. Awk reads the log file directly, so you do not need to use cat at all.

┗━━━━━━━━━━┓ john@localhost ~/Downloads
           ┗━━━━━━━━━━━━━╾ ╍▷ awk '$9 == 403 {print $3, $1, $7, $6, $8, $9, $10, $11}' sslaccesslog_securitronlinux.com_3_27_2023 | sort | uniq -c | sort -rn | head -n 30
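
The counting can also be done inside awk with an associative array, which avoids uniq entirely. This is a sketch, not the only way to do it: the sample log lines are made up, and the names log, blocked and count are hypothetical.

```shell
# Hypothetical sample log in Apache combined format (made-up lines).
log=$(mktemp)
cat > "$log" <<'EOF'
203.0.113.5 - - [27/Mar/2023:07:03:01 -0500] "GET /wp-login.php HTTP/1.1" 403 199 "-" "curl/8.0.1"
203.0.113.9 - - [27/Mar/2023:07:03:02 -0500] "GET /xmlrpc.php HTTP/1.1" 403 199 "-" "curl/8.0.1"
203.0.113.5 - - [27/Mar/2023:07:03:03 -0500] "GET /xmlrpc.php HTTP/1.1" 403 199 "-" "curl/8.0.1"
192.0.2.10 - - [27/Mar/2023:07:03:04 -0500] "GET /feed/ HTTP/1.1" 200 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"
EOF

# Tally 403 responses per client IP in one pass, print "count ip" pairs
# at the end, and sort numerically with the busiest IP first.
blocked=$(awk '$9 == 403 {count[$1]++}
    END {for (ip in count) print count[ip], ip}' "$log" | sort -rn)
printf '%s\n' "$blocked"

rm -f "$log"
```

Because the array is keyed on the IP address, the hits do not need to be adjacent in the log to be counted together.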
