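1. Overview

Text summarization is the process of producing a short, fluent and, most importantly, accurate summary of a longer text document. The main idea behind automatic text summarization is to find the small subset of a collection that carries its most important information and to present it in a human-readable form. As the volume of online text grows, automatic summarization becomes increasingly useful, because it lets readers take in the useful information in a short time.

2. Why automatic text summarization

  • Summaries reduce reading time.
  • When researching documents, summaries make the selection process easier.
  • Automatic summarization improves the effectiveness of indexing.
  • Automatic summarization algorithms are less biased than human summarizers.
  • Personalized summaries are useful in question-answering systems because they provide personalized information.
  • Automatic or semi-automatic summarization systems let commercial abstracting services increase the number of text documents they can process.

3. How summarization approaches are classified

The original figure groups summarization approaches along at least three axes: (1) the type of input document, (2) the purpose of the summary, and (3) how the key information is extracted for the output.

3.1 By input type

  • Single document: the input is short. Many early summarization systems handled single-document summarization.
  • Multi-document: the input can be arbitrarily long.

3.2 By purpose

  • Generic: the model makes no assumptions about the domain or content of the text to be summarized and treats all input as homogeneous. Most of the work done so far has focused on generic summarization.
  • Domain-specific: the model uses domain knowledge to form a more accurate summary, for example when summarizing research papers in a particular field or biomedical literature.
  • Query-based: the summary contains only the information that answers a natural-language question about the input text.

3.3 By output type

  • Extractive: important sentences are selected from the input text to form the summary. Most summarization methods in use today are extractive.
  • Abstractive: the model forms its own phrases and sentences to give a more coherent summary, much as a human would. This approach is clearly more appealing, but it is much harder than extractive summarization.

4. How to summarize text

  • Text cleaning
  • Sentence tokenization
  • Word tokenization
  • Word-frequency table
  • Summarization

4.1 Text cleaning

```python
# !pip install -U spacy
# !python -m spacy download en_core_web_sm
import spacy
from spacy.lang.en.stop_words import STOP_WORDS
from string import punctuation

stopwords = list(STOP_WORDS)
nlp = spacy.load('en_core_web_sm')
# `text` is the input document listed under "Input document" below.
doc = nlp(text)
```

4.2 Word tokenization and word-frequency table

```python
tokens = [token.text for token in doc]
print(tokens)

# Treat the newline character as punctuation as well.
punctuation = punctuation + '\n'

# Count every token that is neither a stop word nor punctuation.
# Keys keep the token's original casing (word.text).
word_frequencies = {}
for word in doc:
    if word.text.lower() not in stopwords:
        if word.text.lower() not in punctuation:
            if word.text not in word_frequencies.keys():
                word_frequencies[word.text] = 1
            else:
                word_frequencies[word.text] += 1
print(word_frequencies)
```

4.3 Frequency normalization and sentence tokenization

```python
# Normalize the counts by the highest count so every weight lies in (0, 1].
max_frequency = max(word_frequencies.values())
for word in word_frequencies.keys():
    word_frequencies[word] = word_frequencies[word] / max_frequency
print(word_frequencies)

sentence_tokens = [sent for sent in doc.sents]
print(sentence_tokens)
```

4.4 Sentence scoring from the word-frequency table

```python
# A sentence's score is the sum of the weights of its tokens.
# The lookup uses the lowercased token, so a token whose lowercase
# form is not a key in word_frequencies contributes nothing.
sentence_scores = {}
for sent in sentence_tokens:
    for word in sent:
        if word.text.lower() in word_frequencies.keys():
            if sent not in sentence_scores.keys():
                sentence_scores[sent] = word_frequencies[word.text.lower()]
            else:
                sentence_scores[sent] += word_frequencies[word.text.lower()]
print(sentence_scores)
```

4.5 Summarization

```python
from heapq import nlargest

# Keep the top 30% of sentences, ranked by score.
select_length = int(len(sentence_tokens) * 0.3)
summary = nlargest(select_length, sentence_scores, key=sentence_scores.get)

final_summary = [word.text for word in summary]
summary = ' '.join(final_summary)
```

Input document

```python
text = """
Maria Sharapova has basically no friends as tennis players on the WTA Tour. The Russian player has no problems in openly speaking about it and in a recent interview she said: 'I don't really hide any feelings too much. I think everyone knows this is my job here. When I'm on the courts or when I'm on the court playing, I'm a competitor and I want to beat every single person whether they're in the locker room or across the net. So I'm not the one to strike up a conversation about the weather and know that in the next few minutes I have to go and try to win a tennis match. I'm a pretty competitive girl. I say my hellos, but I'm not sending any players flowers as well. Uhm, I'm not really friendly or close to many players. I have not a lot of friends away from the courts.' When she said she is not really close to a lot of players, is that something strategic that she is doing? Is it different on the men's tour than the women's tour? 'No, not at all. I think just because you're in the same sport doesn't mean that you have to be friends with everyone just because you're categorized, you're a tennis player, so you're going to get along with tennis players. I think every person has different interests. I have friends that have completely different jobs and interests, and I've met them in very different parts of my life. I think everyone just thinks because we're tennis players we should be the greatest of friends. But ultimately tennis is just a very small part of what we do.
There are so many other things that we're interested in, that we do.'
"""
```

4.6 Final summary output

I think just because you're in the same sport doesn't mean that you have to be friends with everyone just because you're categorized, you're a tennis player, so you're going to get along with tennis players. Maria Sharapova has basically no friends as tennis players on the WTA Tour. I have friends that have completely different jobs and interests, and I've met them in very different parts of my life. I think everyone just thinks because we're tennis players So I'm not the one to strike up a conversation about the weather and know that in the next few minutes I have to go and try to win a tennis match. When she said she is not really close to a lot of players, is that something strategic that she is doing?

For the complete code, see my repository.

5. Conclusion

This article gives a condensed walk-through of the key steps that automatic text summarization involves. Creating a dataset can be a lot of work and is often an overlooked part of learning data science; in practice it deserves attention. That, however, is a topic for another blog post.

Anup Singh (阿努普·辛格)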
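
As a compact recap of the steps in section 4, the same frequency-based extractive approach can be sketched with only the Python standard library. This is a minimal sketch, not the article's spaCy pipeline: the regex sentence splitter, the regex tokenizer, the tiny `STOP_WORDS` set, and the names `split_sentences`, `tokenize` and `summarize` are all assumptions made for illustration. It also keeps at least one sentence for short inputs, which the code in section 4.5 does not do.

```python
import re
from collections import Counter
from heapq import nlargest

# Assumption: a tiny illustrative stop-word list; spaCy's STOP_WORDS is far larger.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "in",
    "is", "it", "of", "on", "or", "so", "that", "the", "this", "to", "was", "with",
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercased word tokens; the regex drops punctuation other than apostrophes."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def summarize(text, ratio=0.3):
    """Return the highest-scoring sentences, joined in score order."""
    sentences = split_sentences(text)

    # Word-frequency table over all tokens that are not stop words.
    counts = Counter(
        w for s in sentences for w in tokenize(s) if w not in STOP_WORDS
    )
    if not counts:
        return ""

    # Normalize by the highest count so every weight lies in (0, 1].
    max_count = max(counts.values())
    weights = {w: c / max_count for w, c in counts.items()}

    # A sentence's score is the sum of its tokens' weights (stop words add 0).
    scores = {
        i: sum(weights.get(w, 0.0) for w in tokenize(s))
        for i, s in enumerate(sentences)
    }

    # Assumption: keep at least one sentence, unlike int(len * 0.3) alone.
    keep = max(1, int(len(sentences) * ratio))
    top = nlargest(keep, scores, key=scores.get)
    return " ".join(sentences[i] for i in top)
```

Calling `summarize("Python is great. Python code is readable and Python tools are mature. I like tea.")` keeps only the second sentence, because it contains the most frequent word twice. As in section 4.5, sentences come back ordered by score rather than by their position in the document; sorting the selected indices before joining would restore document order.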