Python example code: scraping mobile phone listing URLs from JD.com (京东商城)

The code is as follows:

# -*- coding: UTF-8 -*-
'''
Created on 2013-12-5

@author: good-temper
'''
import time
import urllib2

import bs4

def getPage(urlStr):
    '''
    Fetch the page content.
    '''
    content = urllib2.urlopen(urlStr).read()
    return content

def getNextPageUrl(currPageNum):
    # URL pattern:
    # http://list.jd.com/9987-653-655-0-0-0-0-0-0-0-1-1-<page number>-1-1-72-4137-33.html
    url = u'http://list.jd.com/9987-653-655-0-0-0-0-0-0-0-1-1-' + str(currPageNum + 1) + '-1-1-72-4137-33.html'
    # Check whether there is a next page: a page that carries a
    # 'next-disabled' span is treated as the end, so '' is returned for it.
    content = getPage(url)
    soup = bs4.BeautifulSoup(content)
    list = soup.findAll('span', {'class': 'next-disabled'})
    if len(list) == 0:
        return url
    return ''

def analyzeList():
    pageNum = 0
    list = []
    url = getNextPageUrl(pageNum)
    while url != '':
        soup = bs4.BeautifulSoup(getPage(url))
        # Each product title sits in a <div class="p-name"> block.
        pagelist = soup.findAll('div', {'class': 'p-name'})
        for elem in pagelist:
            # Take the href of the first link inside the block.
            list.append(elem.find('a')['href'])
        pageNum = pageNum + 1
        # Progress output: number of pages processed so far.
        print pageNum
        url = getNextPageUrl(pageNum)
    return list

def analyzeContent(url):
    # Placeholder: parsing of an individual product page is not implemented.
    return ''

def writeToFile(list, path):
    # Append one URL per line to the output file.
    f = open(path, 'a')
    for elem in list:
        f.write(elem + '\n')
    f.close()

if __name__ == '__main__':
    list = analyzeList()
    print 'Scraped ' + str(len(list)) + ' URLs in total\n'
    writeToFile(list, u'E:\\jd_phone_list.dat')
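
The listing above targets Python 2 (urllib2 and the print statement). As a rough sketch of the same two steps, building the page URL and pulling the product links out of one list page, the following uses only the Python 3 standard library and runs against an inline sample instead of the network. The names build_page_url, ListPageParser and parse_list_page, and the SAMPLE markup, are hypothetical and introduced only for illustration. The URL pattern and the p-name and next-disabled class names are taken from the listing; whether the live site still uses them is not assumed.

```python
from html.parser import HTMLParser

# URL pattern taken from the listing above; only the page number varies.
PAGE_URL = ('http://list.jd.com/9987-653-655-0-0-0-0-0-0-0-1-1-'
            '{page}-1-1-72-4137-33.html')


def build_page_url(page_num):
    """Return the list-page URL for a 1-based page number (hypothetical helper)."""
    return PAGE_URL.format(page=page_num)


class ListPageParser(HTMLParser):
    """Collect product links and detect the disabled "next" button."""

    def __init__(self):
        super().__init__()
        self.hrefs = []          # first <a href> inside each div.p-name
        self.is_last_page = False
        self._div_depth = 0      # > 0 while inside a div.p-name
        self._took_link = False  # True once the current block's link is stored

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get('class') or '').split()
        if tag == 'span' and 'next-disabled' in classes:
            self.is_last_page = True
        elif tag == 'div':
            if self._div_depth:
                self._div_depth += 1   # nested div inside a p-name block
            elif 'p-name' in classes:
                self._div_depth = 1
                self._took_link = False
        elif (tag == 'a' and self._div_depth and not self._took_link
              and attrs.get('href')):
            self.hrefs.append(attrs['href'])
            self._took_link = True

    def handle_endtag(self, tag):
        if tag == 'div' and self._div_depth:
            self._div_depth -= 1


def parse_list_page(html):
    """Return (hrefs, is_last_page) for the HTML of one list page."""
    parser = ListPageParser()
    parser.feed(html)
    parser.close()
    return parser.hrefs, parser.is_last_page


# Hypothetical sample markup, not copied from the real site.
SAMPLE = '''
<div class="p-name"><a href="http://item.jd.com/1.html">Phone A</a></div>
<div class="p-name"><a href="http://item.jd.com/2.html">Phone B</a></div>
<span class="next-disabled">next</span>
'''
```

Unlike the listing, which stops before a page that carries the next-disabled span, this sketch reports the links and the last-page flag together, so a caller could keep the links from the final page before stopping. For the sample above, parse_list_page(SAMPLE) yields the two item links and True.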
