A robots.txt file is useful for telling web crawlers which areas of your website you do not want crawled and indexed. The example below denies crawlers access to several directories under the root of the public_html folder. On a large site this keeps crawl cycles from being wasted on unneeded folders, so only the important pages are indexed.
```
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /cache/
Disallow: /class/
Disallow: /images/
Disallow: /include/
Disallow: /install/
Disallow: /kernel/
Disallow: /language/
Disallow: /templates_c/
Disallow: /themes/
Disallow: /uploads/
```
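You can sanity-check rules like these before deploying them with Python's standard-library `urllib.robotparser`. A minimal sketch (the `example.com` domain and the bot name are placeholders, and the rule list is abbreviated):

```python
from urllib.robotparser import RobotFileParser

# A few of the rules from the robots.txt above.
rules = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /cache/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# A disallowed directory: crawlers should stay out.
print(rp.can_fetch("AnyBot", "https://example.com/cgi-bin/script.pl"))  # False

# Everything not listed remains crawlable.
print(rp.can_fetch("AnyBot", "https://example.com/about.html"))  # True
```

This is handy because a single typo in robots.txt (a missing colon, a wrong path) can silently block your whole site from being crawled.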
The example below works well for a WordPress website.
```
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /readme.html
Disallow: /refer/
Allow: /wp-admin/admin-ajax.php
# Replace example.com with your own domain; Sitemap must be a full URL.
Sitemap: https://example.com/sitemap.xml
```
This is a great way to optimise how Google crawls the website and to prevent it from wasting crawl budget on unnecessary files.
Here is yet another variant that mixes the two directives: it explicitly allows certain paths (such as CSS and JavaScript files) while disallowing others.
```
# Default robots file version:2
User-agent: *
Disallow: /calendar/action*
Disallow: /events/action*
Allow: /*.css
Allow: /*.js
Disallow: /*?
Crawl-delay: 3
```
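Two caveats about this example. First, wildcard patterns such as `/*.css` are understood by Googlebot and Bingbot but not by every parser; Python's standard `urllib.robotparser`, for instance, treats `*` literally, so don't use it to validate wildcard rules. Second, `Crawl-delay` is a non-standard directive that Google ignores (Bing and Yandex honour it). The stdlib parser does expose it, which gives a quick way to confirm the value parses (a sketch with an abbreviated rule set and a placeholder bot name):

```python
from urllib.robotparser import RobotFileParser

# Abbreviated version of the robots.txt above.
rules = """\
User-agent: *
Disallow: /calendar/action*
Crawl-delay: 3
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# Crawl-delay is read per user-agent group (Python 3.6+).
print(rp.crawl_delay("AnyBot"))  # 3
```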
And finally, this is how to block certain bots from crawling your website.
```
# Disallow Money for Google News
User-agent: Googlebot-News
Disallow: /tmoney/*

# Allow AdSense
User-agent: Mediapartners-Google
Disallow:

User-agent: CrystalSemanticsBot
Disallow: /

User-agent: GPTBot
Disallow: /
```
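Note that an empty `Disallow:` allows everything for that user agent, while `Disallow: /` blocks the whole site. A quick sketch checking that the per-bot groups behave as intended, again using the stdlib parser (URL and rule subset are illustrative):

```python
from urllib.robotparser import RobotFileParser

# Two of the per-bot groups from the robots.txt above.
rules = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: GPTBot
Disallow: /
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

url = "https://example.com/article.html"

# Empty Disallow means AdSense may fetch anything.
print(rp.can_fetch("Mediapartners-Google", url))  # True

# Disallow: / blocks GPTBot from the entire site.
print(rp.can_fetch("GPTBot", url))  # False
```

Keep in mind that robots.txt is advisory: well-behaved bots obey it, but nothing forces a crawler to.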
Or, to enforce the block at the server level rather than merely asking bots to stay away, use this in your Apache .htaccess file.
```apache
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^.*(Baiduspider|HTTrack|Yandex).*$ [NC]
RewriteRule .* - [F,L]
```
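The `[NC]` flag makes the match case-insensitive, and `[F]` returns a 403 Forbidden. To see which User-Agent strings the pattern would catch, here is the same regex exercised in Python (the sample User-Agent strings are illustrative, not exhaustive):

```python
import re

# The RewriteCond pattern above; [NC] corresponds to re.IGNORECASE.
pattern = re.compile(r"^.*(Baiduspider|HTTrack|Yandex).*$", re.IGNORECASE)

# Illustrative User-Agent strings.
agents = [
    "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)",
    "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Gecko/20100101 Firefox/124.0",
]

for ua in agents:
    blocked = bool(pattern.search(ua))
    print(f"{'BLOCK' if blocked else 'allow'}: {ua}")
```

The first two are blocked; the ordinary browser string falls through. Be aware that User-Agent strings are trivially spoofed, so this stops only bots that identify themselves honestly.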