
Website hosting platforms: which language is fastest for building a website

news 2026/4/8 16:10:41

This project is for learning purposes only.

1 scrapy

The crawling logic is simple: pagination is handled through the URL, each listing page yields the links to its detail pages, and each detail page is then scraped for its content. The results are written to an Excel file. In testing, the spider collected 11,299 traditional Chinese medicinal material records in total.

    import pandas as pd
    import scrapy


    class ZhongyaoSpider(scrapy.Spider):
        name = "zhongyao"
        # Listing pages 1 to 26; the page number is part of the URL.
        start_urls = [f"https://www.zysj.com.cn/zhongyaocai/index__{i}.html" for i in range(1, 27)]

        def __init__(self, *args, **kwargs):
            # Let scrapy.Spider run its own initialisation first.
            super().__init__(*args, **kwargs)
            self.data = []

        def parse(self, response):
            for li in response.css("div#list-content ul li"):
                a_tag = li.css("a")
                title = a_tag.css("::attr(title)").get()
                href = a_tag.css("::attr(href)").get()
                if title and href:
                    # Build the full detail-page URL
                    detail_url = response.urljoin(href)
                    yield scrapy.Request(detail_url, callback=self.parse_detail, meta={"title": title})

        # Parsing logic
        def parse_detail(self, response):
            title = response.meta["title"]
            pinyin = response.css("div.item.pinyin_name_phonetic div.item-content::text").get(default="").strip()
            alias = response.css("div.item.alias div.item-content p::text").get(default="").strip()
            english_name = response.css("div.item.english_name div.item-content::text").get(default="").strip()
            # NOTE: this reuses the alias selector, as in the original post; likely a
            # copy-paste slip. Replace it with the class the page actually uses.
            source = response.css("div.item.alias div.item-content p::text").get(default="").strip()
            # Nature and flavor
            flavor = response.css("div.item.flavor div.item-content p::text").get(default="").strip()
            # NOTE: this reuses the flavor selector, as in the original post; likely a
            # copy-paste slip. Replace it with the class the page actually uses.
            functional_indications = response.css("div.item.flavor div.item-content p::text").get(default="").strip()
            usage = response.css("div.item.usage div.item-content p::text").get(default="").strip()
            excerpt = response.css("div.item.excerpt div.item-content::text").get(default="").strip()
            habitat = response.css("div.item.habitat div.item-content p::text").get(default="").strip()
            # Provenance (source text)
            provenance = response.css("div.item.provenance div.item-content p::text").get(default="").strip()
            # Physical characteristics
            shape_properties = response.css("div.item.shape_properties div.item-content p::text").get(default="").strip()
            # Channel tropism
            attribution = response.css("div.item.attribution div.item-content p::text").get(default="").strip()
            # Original plant form
            prototype = response.css("div.item.prototype div.item-content p::text").get(default="").strip()
            # Commentary by noted physicians
            discuss = response.css("div.item.discuss div.item-content p::text").get(default="").strip()
            # Chemical composition
            chemical_composition = response.css("div.item.chemical_composition div.item-content p::text").get(default="").strip()
            item = {
                "title": title,
                "pinyin": pinyin,
                "alias": alias,
                "source": source,
                "english_name": english_name,
                "habitat": habitat,
                "flavor": flavor,
                "functional_indications": functional_indications,
                "usage": usage,
                "excerpt": excerpt,
                "provenance": provenance,
                "shape_properties": shape_properties,
                "attribution": attribution,
                "prototype": prototype,
                "discuss": discuss,
                "chemical_composition": chemical_composition,
            }
            self.data.append(item)
            yield item

        def closed(self, reason):
            # Save the data to an Excel file when the spider closes
            # (pandas needs openpyxl installed to write .xlsx files).
            df = pd.DataFrame(self.data)
            df.to_excel("zhongyao_data.xlsx", index=False)

2 Crawl screenshot

3 Scraped data screenshot
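The pagination and link-resolution steps described in section 1 can be tried without running a crawl. The following is a minimal sketch using only the standard library. The helper names `listing_urls` and `resolve_detail_url` and the example href are hypothetical, introduced here for illustration; only the URL pattern comes from the spider's `start_urls`. Scrapy's `response.urljoin` resolves links in a similar way, though it can also take the page's own base URL into account.

```python
from urllib.parse import urljoin

# URL prefix taken from the spider's start_urls.
BASE = "https://www.zysj.com.cn/zhongyaocai/"


def listing_urls(first: int = 1, last: int = 26) -> list[str]:
    """Hypothetical helper: build the listing-page URLs for pages first..last."""
    return [f"{BASE}index__{i}.html" for i in range(first, last + 1)]


def resolve_detail_url(page_url: str, href: str) -> str:
    """Hypothetical helper: resolve an href found on a listing page."""
    return urljoin(page_url, href)


pages = listing_urls()
print(len(pages))
print(pages[0])
# The href below is made up for illustration, not taken from the site.
print(resolve_detail_url(pages[0], "/zhongyaocai/example/index.html"))
```

Because `range(1, 27)` stops before 27, the spider visits 26 listing pages; the sketch makes that upper bound explicit through the `last` parameter.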


http://www.w-s-a.com/news/565391/