Good tool to crawl my site and help me find dead links and unlinked files

I have a fairly large legacy site with literally hundreds of PDFs that are sometimes accounted for in a database, but often are just links on a page, and they are stored in nearly every directory on the site.

I have written a PHP crawler to follow all the links on my site, and then I compare that against a dump of the directory structure, but is there something easier?
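
The comparison described above can be sketched roughly as follows. This is a minimal Python illustration, not the PHP crawler itself: the function names (`linked_paths`, `files_on_disk`, `compare`), the host name, and the document root are all hypothetical, and it assumes that URL paths map directly onto files under the document root (no rewrites or aliases).

```python
import os
from urllib.parse import urlsplit, unquote


def linked_paths(urls, host):
    """Return site-relative paths for crawled URLs that belong to `host`."""
    paths = set()
    for url in urls:
        parts = urlsplit(url)
        if parts.netloc and parts.netloc != host:
            continue  # skip links that point at other sites
        paths.add(unquote(parts.path).lstrip("/"))
    return paths


def files_on_disk(doc_root, extensions=(".pdf",)):
    """Walk `doc_root` and return relative paths of files with the given extensions."""
    found = set()
    for dirpath, _dirs, filenames in os.walk(doc_root):
        for name in filenames:
            if name.lower().endswith(extensions):
                rel = os.path.relpath(os.path.join(dirpath, name), doc_root)
                found.add(rel.replace(os.sep, "/"))
    return found


def compare(urls, host, doc_root):
    """Return (unlinked files, dead PDF links) as two sorted lists."""
    linked = linked_paths(urls, host)
    on_disk = files_on_disk(doc_root)
    # Files that exist on disk but that no crawled link points to.
    unlinked = sorted(on_disk - linked)
    # PDF links found by the crawl that have no matching file on disk.
    dead = sorted(p for p in linked
                  if p.lower().endswith(".pdf") and p not in on_disk)
    return unlinked, dead
```

Here `urls` would be whatever list of links the crawler collected. The set difference gives the unlinked files in one direction and the dead PDF links in the other.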

2019-05-07 09:48:13
Answers: 5

There are several products from Microsys, especially their A1 Sitemap Generator and A1 Website Analyzer, that will crawl your website and report everything you can possibly imagine about it.

That includes broken links, but also a table view of all your pages so you can compare things like identical title and meta description tags, nofollow links, meta noindex on pages, and a whole lot of conditions that just need a sharp eye and a quick hand to fix. </p></div> <div class="votes-answer green"> <div class="vote-count" itemprop="upvoteCount">0</div><i class="fa fa-heart"></i> </div> <div class="clearfix"></div> <div class="action-time"> <span itemprop="author" itemscope itemtype="http://schema.org/Person"><span itemprop="name"><a href="/profile/8530" rel="noopener noreferrer nofollow" target="_blank">Evgeny</a></span></span> <span title="2019-05-13 15:15:28"> 2019-05-13 15:15:28</span><time class="hidden" itemprop="dateCreated" datetime="2019-05-13T03:05:28+02:00">2019-05-13T03:05:28+02:00</time> </div> <a class="aa-link" href="/source/39852" target="_blank" rel="noopener noreferrer nofollow">Source</a> <a itemprop="url" class="s-link" href="https://askdev.io/questions/4013/excellent-device-to-creep-my-website-and-also-aid-me-locate#39852" title="Share">Share</a> <div class="clearfix"></div> </div> </div> </div> <div class="answer" id="15295" itemscope itemtype="http://schema.org/Answer" itemprop="suggestedAnswer"> <div class="answer-row"> <div class="answer-text"> <div class="desc" itemprop="text"> <p>I've used <a href="http://home.snafu.de/tilman/xenulink.html" rel="noopener noreferrer nofollow">Xenu's Link Sleuth</a>. It works pretty well, just be sure not to DOS yourself! 
</p></div> <div class="votes-answer green"> <div class="vote-count" itemprop="upvoteCount">0</div><i class="fa fa-heart"></i> </div> <div class="clearfix"></div> <div class="action-time"> <span itemprop="author" itemscope itemtype="http://schema.org/Person"><span itemprop="name"><a href="/profile/1525" rel="noopener noreferrer nofollow" target="_blank">plntxt</a></span></span> <span title="2019-05-09 04:29:19"> 2019-05-09 04:29:19</span><time class="hidden" itemprop="dateCreated" datetime="2019-05-09T04:05:19+02:00">2019-05-09T04:05:19+02:00</time> </div> <a class="aa-link" href="/source/15295" target="_blank" rel="noopener noreferrer nofollow">Source</a> <a itemprop="url" class="s-link" href="https://askdev.io/questions/4013/excellent-device-to-creep-my-website-and-also-aid-me-locate#15295" title="Share">Share</a> <div class="clearfix"></div> </div> </div> </div> <div class="answer" id="15206" itemscope itemtype="http://schema.org/Answer" itemprop="suggestedAnswer"> <div class="answer-row"> <div class="answer-text"> <div class="desc" itemprop="text"> <p>If you are using Windows 7, the best tool is IIS7's SEO Toolkit 1.0. It is free to download. </p> <p>The tool will scan any site and tell you where all of the dead links are, which pages take too long to load, which pages have missing or duplicate titles (and the same for keywords and descriptions), and which pages have broken HTML. 
</p></div> <div class="votes-answer green"> <div class="vote-count" itemprop="upvoteCount">0</div><i class="fa fa-heart"></i> </div> <div class="clearfix"></div> <div class="action-time"> <span itemprop="author" itemscope itemtype="http://schema.org/Person"><span itemprop="name"><a href="/profile/1832" rel="noopener noreferrer nofollow" target="_blank">Ben Hoffman</a></span></span> <span title="2019-05-09 04:09:45"> 2019-05-09 04:09:45</span><time class="hidden" itemprop="dateCreated" datetime="2019-05-09T04:05:45+02:00">2019-05-09T04:05:45+02:00</time> </div> <a class="aa-link" href="/source/15206" target="_blank" rel="noopener noreferrer nofollow">Source</a> <a itemprop="url" class="s-link" href="https://askdev.io/questions/4013/excellent-device-to-creep-my-website-and-also-aid-me-locate#15206" title="Share">Share</a> <div class="clearfix"></div> </div> </div> </div> <div class="answer" id="15200" itemscope itemtype="http://schema.org/Answer" itemprop="suggestedAnswer"> <div class="answer-row"> <div class="answer-text"> <div class="desc" itemprop="text"> <p>I'm a big fan of <a href="http://www.linklint.org/" rel="noopener noreferrer nofollow"><strong>linklint</strong></a> for link-checking large static sites, if you have a Unix command line available (I've used it on Linux, MacOS, and FreeBSD). See their site for installation instructions. Once installed, I create a file called <code>check.ll</code> and run: </p> <pre><code>linklint @check.ll </code></pre> <p>Here's what my <code>check.ll</code> file looks like: </p> <pre><code># linklint
-doc .
-delay 0
-http
-htmlonly
-limit 4000
-net
-host www.example.com
-timeout 10
</code></pre> <p>That does a crawl of <code>www.example.com</code> and generates HTML files with cross-referenced reports of what is broken, missing, etc. </p></div> <div class="votes-answer green"> <div class="vote-count" itemprop="upvoteCount">0</div><i class="fa fa-heart"></i> </div> <div class="clearfix"></div> <div class="action-time"> <span itemprop="author" itemscope itemtype="http://schema.org/Person"><span itemprop="name"><a href="/profile/165" rel="noopener noreferrer nofollow" target="_blank">artlung</a></span></span> <span title="2019-05-09 04:08:27"> 2019-05-09 04:08:27</span><time class="hidden" itemprop="dateCreated" datetime="2019-05-09T04:05:27+02:00">2019-05-09T04:05:27+02:00</time> </div> <a class="aa-link" href="/source/15200" target="_blank" rel="noopener noreferrer nofollow">Source</a> <a itemprop="url" class="s-link" href="https://askdev.io/questions/4013/excellent-device-to-creep-my-website-and-also-aid-me-locate#15200" title="Share">Share</a> <div class="clearfix"></div> </div> </div> </div> <div class="answer" id="13290" itemscope itemtype="http://schema.org/Answer" itemprop="suggestedAnswer"> <div class="answer-row"> <div class="answer-text"> <div class="desc" itemprop="text"> <p>Try <a href="http://validator.w3.org/docs/checklink.html" rel="noopener noreferrer nofollow">W3C's open source tool Link Checker</a>. 
You can use it online or install it locally. </p></div> <div class="votes-answer green"> <div class="vote-count" itemprop="upvoteCount">0</div><i class="fa fa-heart"></i> </div> <div class="clearfix"></div> <div class="action-time"> <span itemprop="author" itemscope itemtype="http://schema.org/Person"><span itemprop="name"><a href="/profile/1116" rel="noopener noreferrer nofollow" target="_blank">mvark</a></span></span> <span title="2019-05-08 21:00:15"> 2019-05-08 21:00:15</span><time class="hidden" itemprop="dateCreated" datetime="2019-05-08T09:05:15+02:00">2019-05-08T09:05:15+02:00</time> </div> <a class="aa-link" href="/source/13290" target="_blank" rel="noopener noreferrer nofollow">Source</a> <a itemprop="url" class="s-link" href="https://askdev.io/questions/4013/excellent-device-to-creep-my-website-and-also-aid-me-locate#13290" title="Share">Share</a> <div class="clearfix"></div> </div> </div> </div> </div> </div> </div> </div> </div> <footer class="footer"> <div class="container"> <div style="margin-bottom: 10px;" 
class="select_lng"><strong>Language: </strong> <span class="label label-success" style="margin:0px; font-size:100%"><span class="flag-icon flag-icon-us"></span></span>  <a href="https://askdev.io/cn/questions/4013"><span class="flag-icon flag-icon-cn"></span></a>  <a href="https://askdev.io/de/questions/4013"><span class="flag-icon flag-icon-de"></span></a>  <a href="https://askdev.io/es/questions/4013"><span class="flag-icon flag-icon-es"></span></a>  <a href="https://askdev.io/fr/questions/4013"><span class="flag-icon flag-icon-fr"></span></a>  <a href="https://askdev.io/hi/questions/4013"><span class="flag-icon flag-icon-in"></span></a>  <a href="https://askdev.io/id/questions/4013"><span class="flag-icon flag-icon-id"></span></a>  <a href="https://askdev.io/it/questions/4013"><span class="flag-icon flag-icon-it"></span></a>  <a href="https://askdev.io/jp/questions/4013"><span class="flag-icon flag-icon-jp"></span></a>  <a href="https://askdev.io/kr/questions/4013"><span class="flag-icon flag-icon-kr"></span></a>  <a href="https://askdev.io/nl/questions/4013"><span class="flag-icon flag-icon-nl"></span></a>  <a href="https://askdev.io/pt/questions/4013"><span class="flag-icon flag-icon-pt"></span></a>  <a href="https://askdev.io/ru/questions/4013"><span class="flag-icon flag-icon-ru"></span></a>  <a href="https://askdev.io/tr/questions/4013"><span class="flag-icon flag-icon-tr"></span></a>  <a href="https://askdev.io/ua/questions/4013"><span class="flag-icon flag-icon-ua"></span></a></div> </div> <div class="container"> <div class="pull-left"> <div class="license"> licensed under <a href="https://creativecommons.org/licenses/by-sa/3.0/" rel="nofollow license">cc by-sa 3.0</a> with attribution. 
</div> </div> <div class="pull-right logo"> <a class="hidden-xs mail" href="mailto:info@askdev.io">info@askdev.io</a> <a href="#"> <div class="name"><span>AskDev.io</span></div> </a> </div> </div> </footer> <script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.15.6/highlight.min.js"></script> <script>hljs.initHighlightingOnLoad();</script> <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.4.1/jquery.slim.min.js"></script> <script src="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.4.1/js/bootstrap.min.js"></script> </body> </html>