
How to access a web page with Python (visiting websites with Python)

http://www.itjxue.com 2023-03-25 11:44 Source: unknown

20. Requesting web pages with Requests in Python

Install with pip: pip install requests

Notes on the result:

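1. The page arrives compressed either because of the Accept-Encoding header that requests sends by default, or because Sina serves compressed pages by default.

2. Reading response.content still works without problems because requests decompresses gzip- and deflate-encoded responses automatically.

3. When a response arrives, Requests guesses its encoding and uses that guess to decode the body when you read the response.text property. Requests first looks for an encoding declared in the HTTP headers; if none is found, it falls back to chardet.detect to guess the encoding (which can be wrong). For text/* responses whose Content-Type header carries no charset, Requests assumes ISO-8859-1, a common cause of garbled Chinese pages.

4. For this reason, response.content.decode() is the recommended way to get the text.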
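A minimal sketch of point 4, assuming the page is either UTF-8 or a GBK-family encoding. The helper names decode_body and fetch_text and the candidate encoding list are illustrative and not part of the original article.

```python
def decode_body(body, candidates=("utf-8", "gb18030")):
    """Decode raw bytes by trying each candidate encoding in order."""
    for encoding in candidates:
        try:
            return body.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Nothing matched: fall back to lossy UTF-8 instead of raising.
    return body.decode("utf-8", errors="replace")


def fetch_text(url, timeout=10):
    """Download a page and decode it from response.content, not response.text."""
    import requests  # third-party (pip install requests); imported lazily

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    # response.content has already been decompressed by requests.
    return decode_body(response.content)
```

Decoding the bytes yourself keeps the choice of encoding explicit instead of depending on what the server declared.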


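How to open a web page with Python and type into a form

I have used Selenium to simulate a browser.

Open a browser with Selenium's Chrome or Firefox webdriver, then:

driver.get(url)  # open your page
form_input = driver.find_element(By.XPATH, "xxx")  # Selenium 4; needs: from selenium.webdriver.common.by import By

Once the form element has been located by XPath, id, or a similar strategy, type into it with:

form_input.send_keys("xxx")

Note that the original snippet named the variable "from", which is a reserved word in Python, and used find_elements_by_xpath, which returns a list rather than a single element.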
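The steps above can be sketched as one small function. This is an illustration only: fill_form_field and demo are hypothetical names, and the URL and locator in demo are placeholders. The helper takes the locator as a (by, value) tuple so that it works with any driver object.

```python
def fill_form_field(driver, url, locator, text):
    """Open url, find one element by locator, and type text into it."""
    driver.get(url)
    field = driver.find_element(*locator)  # locator is a (by, value) tuple
    field.clear()
    field.send_keys(text)
    return field


def demo():
    # Third-party: pip install selenium (4.x); needs a local Chrome install.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        # Placeholder URL and locator; replace with your own page and field.
        fill_form_field(driver, "https://example.com/form", (By.NAME, "q"), "hello")
    finally:
        driver.quit()
```

Calling demo() opens a real browser window; fill_form_field on its own never imports Selenium.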

Accessing websites with multiple threads in Python

# Python 3 (ported from the original Python 2 script)
import os
import sys
import threading
import time

import requests


class Archives(object):
    def __init__(self, url):
        self.url = url

    def save_html(self, text):
        # File name: "<timestamp>_<last URL segment>", stored under ./htmls
        fn = '{}_{}'.format(int(time.time()), self.url.split('/')[-1])
        dirname = 'htmls'
        os.makedirs(dirname, exist_ok=True)
        with open(os.path.join(dirname, fn), 'w', encoding='utf-8') as f:
            f.write(text)

    def get_htmls(self):
        try:
            r = requests.get(self.url, timeout=10)
            r.raise_for_status()
            r.encoding = r.apparent_encoding
            print('get html from', self.url)
            self.save_html(r.text)
        except Exception as e:
            print('fetch failed:', e)

    def main(self):
        # Pass the method itself; target=self.get_htmls() would run it in the main thread
        thread = threading.Thread(target=self.get_htmls)
        thread.start()
        return thread


if __name__ == '__main__':
    start = time.time()
    fn = sys.argv[1] if len(sys.argv) > 1 else 'urls.txt'
    with open(fn) as f:
        urls = {line.strip() for line in f if line.strip()}
    # Start every thread before joining any, so the downloads overlap
    threads = [Archives(url).main() for url in urls]
    for t in threads:
        t.join()
    print(time.time() - start)
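As an alternative to one thread per URL, the standard library's thread pool caps the number of simultaneous requests. This is a sketch rather than the article's own method: fetch_all is a hypothetical helper, and the fetch callable is supplied by the caller.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_all(urls, fetch, max_workers=8):
    """Run fetch(url) for every distinct URL in a thread pool.

    Returns {url: result}; a URL whose fetch raised maps to the exception.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, url): url for url in set(urls)}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # keep going when one download fails
                results[url] = exc
    return results
```

With requests installed, fetch could be something like lambda u: requests.get(u, timeout=10).text.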

How to access a web page you wrote yourself with Python

<html>
<body>
<form>
Available codes:
<select name="liscode">
<option value="01">123456</option>
<option value="02">123457</option>
<option value="03">123458</option>
<option value="04">123459</option>
<option value="05">123460</option>
<option value="06">123461</option>
</select>
<input type="submit" value="Get code"/>
</form>
</body>
</html>

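All of the liscode values are read from a txt file. When a user claims one, that entry is deleted.

How can this be implemented in Python?

If you hand users a .py script or an exe, it would look roughly like this:

Python code

# Read every code; each line keeps its trailing newline
with open('codelist.txt', 'r') as infile:
    codelist = infile.readlines()
# Take the first code and remove it from the list
used_code = codelist.pop(0).strip()
# Rewrite the file without the used line; a text file has no in-place "delete one line" operation
with open('codelist.txt', 'w') as outfile:
    outfile.writelines(codelist)
print(used_code)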
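If the same logic runs behind the web form, two users may submit at the same moment. The sketch below wraps the read-and-rewrite step in a lock and handles an empty file. The name claim_code is hypothetical, and the lock only protects threads inside one process, which is an assumption about how the page is served.

```python
import threading

_codes_lock = threading.Lock()


def claim_code(path="codelist.txt"):
    """Remove and return the first code in the file, or None if none are left."""
    with _codes_lock:  # two requests must not be handed the same code
        with open(path, "r", encoding="utf-8") as infile:
            codes = [line.strip() for line in infile if line.strip()]
        if not codes:
            return None
        used_code, remaining = codes[0], codes[1:]
        with open(path, "w", encoding="utf-8") as outfile:
            outfile.writelines(code + "\n" for code in remaining)
        return used_code
```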

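How to access pages that require login with Python

You can try attaching the relevant cookies to your request. Log in once in your own browser, copy the cookies for the page, and then build a request that carries them so that it looks like it comes from a logged-in session. If the site's checks are not especially strict, this will work.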
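A minimal sketch of the cookie approach, assuming the cookie string was copied from the browser's developer tools as a single "name=value; name=value" line. parse_cookie_header and fetch_logged_in are hypothetical helper names, and the User-Agent header is an extra assumption about what a site might check.

```python
def parse_cookie_header(raw):
    """Turn 'a=1; b=2' (as copied from the browser) into a dict."""
    cookies = {}
    for pair in raw.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep and name:
            cookies[name] = value
    return cookies


def fetch_logged_in(url, raw_cookie, timeout=10):
    """Request a page while presenting cookies from an existing login."""
    import requests  # third-party (pip install requests); imported lazily

    headers = {"User-Agent": "Mozilla/5.0"}  # assumption: some sites check this too
    response = requests.get(
        url,
        cookies=parse_cookie_header(raw_cookie),
        headers=headers,
        timeout=timeout,
    )
    response.raise_for_status()
    return response.text
```

Session cookies expire, so the copied string has to be refreshed whenever the site logs you out.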

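Another option is the Selenium framework, which opens a real browser and visits the URL you give it. You still have to log in once, but every step of the login (entering the username and password, clicking the login button) can be automated. See the official documentation for details.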
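The login steps can be sketched as follows. This is an illustration under assumptions: login is a hypothetical helper, and the locators for the username field, password field, and submit button depend entirely on the target page.

```python
def login(driver, url, username, password, locators):
    """Fill in a login form and submit it.

    locators maps "username", "password", and "submit" to (by, value) tuples.
    """
    driver.get(url)
    driver.find_element(*locators["username"]).send_keys(username)
    driver.find_element(*locators["password"]).send_keys(password)
    driver.find_element(*locators["submit"]).click()
```

With Selenium 4 the locators could look like {"username": (By.ID, "user"), "password": (By.ID, "pass"), "submit": (By.CSS_SELECTOR, "button[type=submit]")}, where the ids and selector are placeholders.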

Operating on an already-open page with Python

1. Subclass Remote and override start_session so that no new session is created:

# Written against Selenium 3; the desired_capabilities argument was removed in later Selenium 4 releases
from selenium.common.exceptions import InvalidArgumentException
from selenium.webdriver import Remote
from selenium.webdriver.chrome import options

class ReuseChrome(Remote):
    def __init__(self, command_executor, session_id):
        # Set this first: Remote.__init__ calls start_session(), which reads it
        self.r_session_id = session_id
        Remote.__init__(self, command_executor=command_executor, desired_capabilities={})

    def start_session(self, capabilities, browser_profile=None):
        # Attach to the existing session instead of asking the server for a new one
        if not isinstance(capabilities, dict):
            raise InvalidArgumentException("Capabilities must be a dictionary")
        if browser_profile:
            if "moz:firefoxOptions" in capabilities:
                capabilities["moz:firefoxOptions"]["profile"] = browser_profile.encoded
            else:
                capabilities.update({'firefox_profile': browser_profile.encoded})
        self.capabilities = options.Options().to_capabilities()
        self.session_id = self.r_session_id
        self.w3c = False

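2. Attach to the already-open page like this:

dr = ReuseChrome(command_executor=old_curl, session_id=sessionid)

Here old_curl and sessionid must be read from the driver that originally opened the page:

old_curl = dr.command_executor._url  # you must use this attribute; the current page URL will not work

sessionid = dr.session_id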
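To reattach from a different script, the two values have to be stored somewhere. The sketch below writes them to a JSON file. save_session and load_session are hypothetical helpers, and the file location is up to you.

```python
import json


def save_session(driver, path):
    """Write the executor URL and session id of a running driver to a JSON file."""
    info = {
        # Private attribute, the same one the text above reads
        "executor_url": driver.command_executor._url,
        "session_id": driver.session_id,
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(info, f)
    return info


def load_session(path):
    """Read back (executor_url, session_id) for reattaching later."""
    with open(path, "r", encoding="utf-8") as f:
        info = json.load(f)
    return info["executor_url"], info["session_id"]
```

The second script would then pass the loaded pair to ReuseChrome(command_executor=..., session_id=...). The browser must still be running for the session id to be valid.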

(Editor: IT教学网)
