import json  # load the json module
Here is the first function of the script:
Data acquisition and calculation
    # The code above exists mainly so that the code below can fetch all the data
    outerlink_num = int(data1['count'])  # total number of external links
    range_page_num = outerlink_num // 100 + 2  # 100 links per page; exclusive upper bound for the page range
    all_data = []  # list that collects every external-link record
def url_outerlink_anchor(url):  # query the Webmaster Platform
    html1 = urllib.request.urlopen('http://zhanzhang.baidu.com/inbound/detail?d=%s&pagesize=100&page=1' % url).read().decode('utf-8')  # open the first page of the external-link list
Note: the main job of this function is to query the external-link tool on the Baidu Webmaster Platform for all of the site's external-link data and write it into the all_data list.
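The acquisition step described in the note can be sketched as follows. This is a minimal sketch under stated assumptions, not the author's full function: the HTTP request is replaced by an injectable fetch_page callable so the example runs offline, the 'list' field name is hypothetical (only 'count' appears in the script), and the page count uses ceiling division rather than the script's own formula.

```python
import json


def collect_all_links(fetch_page, page_size=100):
    """Fetch page 1 to learn the total, then walk the remaining pages.

    fetch_page(page) must return the JSON text for that page.
    """
    first = json.loads(fetch_page(1))
    total = int(first["count"])  # 'count' is the field the script reads
    last_page = (total + page_size - 1) // page_size  # ceiling division
    all_data = list(first.get("list", []))  # 'list' is a hypothetical field name
    for page in range(2, last_page + 1):
        payload = json.loads(fetch_page(page))
        all_data.extend(payload.get("list", []))
    return all_data


def fake_fetch(page):
    """Stand-in for the HTTP request: serves 250 fake records, 100 per page."""
    records = [{"id": n} for n in range(250)]
    chunk = records[(page - 1) * 100: page * 100]
    return json.dumps({"count": 250, "list": chunk})


links = collect_all_links(fake_fetch)
print(len(links))
```

In real use, fetch_page would wrap urllib.request.urlopen and decode the response as UTF-8, as the script does for the first page.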
import urllib.request  # load urllib.request
    for i in range_page_num_list:  # loop over the pages, extracting each page's data and appending it to all_data
To analyze the external-link data reported by the external-link tool in Baidu Webmaster Tools, I wrote a dedicated Python script. It fetches the data directly from the Webmaster Platform, stores it, and derives three analyses: linking pages and how often each appears, external-link root domains and the number of links under each domain, and anchor texts and the count of each anchor text. Since the raw data is available, it can of course be analyzed along more dimensions. The script covers these three, and it can be extended for other needs.
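The three analyses can be illustrated with a short sketch, assuming each stored record is a dict with hypothetical keys 'url' (the linking page) and 'anchor' (the anchor text); the real field names returned by the platform are not shown here. The root-domain helper is deliberately naive: it keeps the last two labels of the host name, which is wrong for suffixes such as .com.cn.

```python
from collections import Counter
from urllib.parse import urlparse


def naive_root_domain(link):
    """Return the last two labels of the host name (naive approximation)."""
    host = urlparse(link).hostname or ""
    return ".".join(host.split(".")[-2:])


def summarize(records):
    """Count linking pages, root domains, and anchor texts."""
    pages = Counter(r["url"] for r in records)
    domains = Counter(naive_root_domain(r["url"]) for r in records)
    anchors = Counter(r["anchor"] for r in records)
    return pages, domains, anchors


# Made-up sample records, for illustration only
sample = [
    {"url": "http://blog.example.com/post-1", "anchor": "seo tools"},
    {"url": "http://blog.example.com/post-1", "anchor": "seo tools"},
    {"url": "http://news.example.org/item", "anchor": "home page"},
]
pages, domains, anchors = summarize(sample)
print(domains.most_common())
```

Counter keeps each tally as a dict-like object, so most_common() gives a frequency-sorted list for any of the three dimensions.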
    data1 = json.loads(html1)  # parse the JSON response into a Python object
The code and explanations are posted below:
    range_page_num_list = range(1, range_page_num)  # e.g. if range_page_num = 4, this iterates over 1, 2, 3: three pages of 100 links each
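As a worked check of the page arithmetic, the sketch below compares the pages visited by the script's formula (outerlink_num // 100 + 2 as the exclusive upper bound) with the pages that actually hold data. The helper names are hypothetical and only serve the illustration. The two agree except when the total is an exact multiple of 100, in which case the script's formula requests one extra, empty page.

```python
def pages_requested(outerlink_num, page_size=100):
    """Pages visited by range(1, outerlink_num // page_size + 2)."""
    return list(range(1, outerlink_num // page_size + 2))


def pages_needed(outerlink_num, page_size=100):
    """Pages that actually hold data, using ceiling division."""
    return list(range(1, -(-outerlink_num // page_size) + 1))


for n in (250, 300):
    print(n, pages_requested(n), pages_needed(n))
```

The extra request is harmless if an empty page simply contributes no records, but ceiling division avoids it.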