A Python 3 script for analyzing external link data from the Baidu Webmaster Platform

import json  # load the json module

Here is the first function of the script:

Data acquisition and calculation

# the code above mainly prepares the values needed to fetch all of the data below

outerlink_num = int(data1['count'])  # total number of external links



range_page_num = outerlink_num // 100 + 2  # 100 links per page; exclusive upper bound for the page range

all_data = []

def url_outerlink_anchor(url):  # query the Webmaster Platform for the given site

    html1 = urllib.request.urlopen('http://zhanzhang.baidu.com/inbound/detail?d=%s&pagesize=100&page=1' % url).read().decode('utf-8')  # open the first page of the external link list
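The query string above is assembled with `%` formatting, which does not escape the site value. Below is a minimal sketch of building the same kind of URL with `urllib.parse.urlencode`. The helper name `build_detail_url` is made up for illustration, and the endpoint and the parameter names (`d`, `pagesize`, `page`) are assumptions read off the URL in the text, not a documented API.

```python
from urllib.parse import urlencode

# Assumed endpoint, taken from the URL used in the text.
BASE_URL = 'http://zhanzhang.baidu.com/inbound/detail'


def build_detail_url(site, page=1, pagesize=100):
    """Return the URL for one page of external links (hypothetical helper)."""
    # urlencode escapes characters such as '&' or spaces in the site value.
    query = urlencode({'d': site, 'pagesize': pagesize, 'page': page})
    return BASE_URL + '?' + query


print(build_detail_url('example.com', page=2))
```

Passing the page number as an argument also makes it easy to reuse the helper for every page, not only the first.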

Note: this function fetches all of the site's external link data from the external link tool on the Baidu Webmaster Platform and writes it into the all_data list.

import urllib.request  # load the urllib.request module

for i in range_page_num_list:  # loop over the pages, extracting the data and appending it to all_data
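The body of this loop is not shown in the text. One way it might look, as a sketch only: each page is fetched, parsed, and its records are appended to `all_data`. The `fetch_page` callable and the `'list'` key are hypothetical; the real response may name its fields differently.

```python
import json


def collect_all_pages(fetch_page, page_numbers):
    """Fetch every page and return all records in one list.

    fetch_page is a hypothetical callable that takes a page number and
    returns the raw JSON text; in the script it would wrap urlopen().
    """
    all_data = []
    for i in page_numbers:
        payload = json.loads(fetch_page(i))
        # 'list' is an assumed key for the records on a page.
        all_data.extend(payload.get('list', []))
    return all_data


# Stand-in fetcher so the sketch runs without network access.
def fake_fetch(page):
    return json.dumps({'count': '3', 'list': [{'page': page}]})


print(collect_all_pages(fake_fetch, range(1, 4)))
```

Injecting the fetcher keeps the paging logic testable without contacting the platform.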

To analyze the external link data reported by the external link tool in Baidu Webmaster Tools, I wrote a Python script. It fetches the data directly from the Webmaster Platform, stores it, and derives three things: the linked pages on the site and how often each one is linked, the root domains of the external links and the number of links from each, and the anchor texts and the number of times each one is used. Once the data has been fetched it can of course be analyzed along more dimensions. The script covers only these three, and it can be extended if you have other needs.
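The three counts described here map naturally onto `collections.Counter`. The following is a sketch under assumptions: the record keys `linked_page`, `source_url` and `anchor` are invented for illustration, and the root domain is approximated by the last two labels of the host name, which is wrong for suffixes such as `.com.cn`.

```python
from collections import Counter
from urllib.parse import urlparse


def root_domain(url):
    """Approximate the root domain as the last two labels of the host name."""
    host = urlparse(url).hostname or ''
    return '.'.join(host.split('.')[-2:])


def summarize(records):
    """Count linked pages, source root domains and anchor texts."""
    pages = Counter(r['linked_page'] for r in records)
    domains = Counter(root_domain(r['source_url']) for r in records)
    anchors = Counter(r['anchor'] for r in records)
    return pages, domains, anchors


# Made-up sample records; the key names are assumptions.
sample = [
    {'linked_page': '/', 'source_url': 'http://blog.a.com/x', 'anchor': 'home'},
    {'linked_page': '/', 'source_url': 'http://www.a.com/y', 'anchor': 'home'},
    {'linked_page': '/about', 'source_url': 'http://b.org/z', 'anchor': 'about us'},
]

pages, domains, anchors = summarize(sample)
print(pages.most_common())
print(domains.most_common())
print(anchors.most_common())
```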

data1 = json.loads(html1)  # parse the JSON response into a Python object

The code and explanations are posted below:


range_page_num_list = range(1, range_page_num)  # e.g. if range_page_num = 4, the pages are 1, 2 and 3: three pages at 100 links per page
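To make the page arithmetic concrete, here is a small sketch that reproduces the script's formula and compares it with a ceiling-division variant. The function names are made up for illustration. When the total is an exact multiple of 100, the script's formula asks for one extra page, which would presumably come back empty.

```python
def page_numbers(total, per_page=100):
    """Page numbers as computed in the script: range(1, total // per_page + 2)."""
    return list(range(1, total // per_page + 2))


def page_numbers_exact(total, per_page=100):
    """Variant using ceiling division, so no surplus page is requested."""
    pages = -(-total // per_page)  # integer ceiling division
    return list(range(1, pages + 1))


for total in (250, 300):
    print(total, page_numbers(total), page_numbers_exact(total))
```

The extra request is harmless if the platform returns an empty list for a page past the end, so the original formula can be kept as long as that behaviour holds.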
