Site operation: how to analyze the characteristics of spider crawls


First: confirm that your own virtual host or server has logging turned on. Recording a WWW log is a standard feature in the control panel of most hosting providers, and the log is made available for the webmaster to download and analyze. The log format shown below is the one the editor uses; because the steps and layout differ from one hosting provider to another, treat it only as a reference.

First, click through to the interface shown in Figure 1 or Figure 2 and click the option to download the weblog; the interfaces in Figure 3 and Figure 4 then appear. Each TXT file in Figure 4 is named after its year, month, and day and is listed with the size of the log. Click View to see the detailed contents.

Parameter 2: the content that was crawled. GET means a fetch, and the /index.html that follows it is the page that was fetched, so this entry says the spider crawled that page. If GET is followed only by /, the spider requested the site root rather than a specific page; in that case the website >
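As a rough illustration of reading this parameter, the sketch below pulls the request method and path out of one log line. The function name `requested_path` and the sample lines are made up for this example, and the pattern assumes the method and path appear as `GET /path` somewhere in the line; your host's log layout may differ.

```python
import re

# Assumption: the request shows up as "<METHOD> <path>" inside the line.
REQUEST = re.compile(r"\b(GET|POST|HEAD)\s+(/\S*)")

def requested_path(line):
    """Return (method, path) for one log line, or None if no request is found."""
    match = REQUEST.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)

# Hypothetical log lines, not taken from a real log.
print(requested_path("05:08:00 GET /index.html 200"))
print(requested_path("05:10:45 GET / 200"))
```

A path of just `/` is the site root, which is the case the paragraph above singles out.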

In the daily running and maintenance of a site, we often need to check the space's WWW log to see how spiders are crawling it, and then adjust our routine work accordingly. The steps below walk through the log settings and the characteristics of spider crawls, and explain what each parameter means, so that you can use them as a reference when making adjustments and changes.


Parameter 1: this is the time at which the Baidu spider crawled the content. It differs from the time on your computer by 8 hours, because the log records Greenwich Mean Time, which is 8 hours behind Beijing time. Add 8 hours to the logged time to get Beijing time, so for the parameter shown in Figure 1 the spider's crawl time is May 23rd 13 8.
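The 8-hour adjustment can be sketched in a few lines of Python. This is only an illustration: the function name `gmt_to_beijing`, the timestamp format, and the sample timestamp are assumptions, so change the format string to match what your own log actually records.

```python
from datetime import datetime, timedelta, timezone

# Beijing time is a fixed 8 hours ahead of GMT/UTC.
BEIJING = timezone(timedelta(hours=8))

def gmt_to_beijing(stamp, fmt="%Y-%m-%d %H:%M:%S"):
    """Read a log timestamp as GMT and return it as a Beijing-time datetime."""
    gmt = datetime.strptime(stamp, fmt).replace(tzinfo=timezone.utc)
    return gmt.astimezone(BEIJING)

# Hypothetical timestamp, not taken from the log in the figures.
local = gmt_to_beijing("2020-01-01 20:30:00")
print(local.strftime("%Y-%m-%d %H:%M"))  # prints 2020-01-02 04:30
```

Note that adding 8 hours can roll the entry over into the next day, so a crawl logged late in the evening GMT belongs to the following date in Beijing time.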

Second: find the traces of spiders in the log. A single TXT log runs to hundreds of KB and thousands of lines, so checking every line is not realistic. Instead, we need to know what identifies a spider and use the search function to locate it quickly. Many crawlers carry the word spider in their identifier, so searching for "spider" turns up most visiting spiders at once; the search engines crawling your site may include Baidu, Google, 360, and so on. The identifier of the Baidu spider is Baiduspider, and here we focus on the Baidu spider.
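For logs too long to search by hand, the same lookup can be scripted. The following is a minimal sketch, assuming the spider's identifier appears as plain text in each log line; the function name `find_spider_lines` and the sample lines are invented for illustration and do not reproduce any particular host's format.

```python
def find_spider_lines(lines, marker="Baiduspider"):
    """Return (line_number, text) pairs for lines containing the marker, ignoring case."""
    needle = marker.lower()
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if needle in line.lower()
    ]

# Hypothetical log lines; a real log would be read with open(...).
sample = [
    "05:08:00 GET /index.html Mozilla/5.0+(compatible;+Baiduspider/2.0)\n",
    "05:09:10 GET /about.html Mozilla/5.0+(Windows+NT+6.1)\n",
    "05:10:45 GET / Mozilla/5.0+(compatible;+360Spider)\n",
]

for number, text in find_spider_lines(sample):
    print(number, text)
```

Passing `marker="spider"` instead widens the search to every line whose identifier contains that word, which here matches both the Baidu and the 360 entries.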


Third: once you have found the Baidu spider's crawl records, go through each parameter in them. The editor explains each one below and describes a corresponding example (see diagram).



Open the downloaded TXT document in Notepad and use the Find function in the Edit menu (Figure 5) to search quickly: enter Baidu in the search box and locate the Baidu spider's crawl records (Figure 6), which can be confirmed by
