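1. Introduction

Text summarization is the process of generating a short, fluent and, above all, accurate summary of a longer text document. The main idea behind automatic text summarization is to find a small subset of the most important information in the whole collection and present it in a human-readable format. As the volume of online text data grows, automatic summarization methods can be very useful, because the useful information can be read in a short time.

2. Why automatic text summarization

  • Summaries reduce reading time.
  • When researching documents, summaries make the selection process easier.
  • Automatic summarization improves the effectiveness of indexing.
  • Automatic summarization algorithms are less biased than human summarizers.
  • Personalized summaries are useful in question-answering systems because they provide personalized information.
  • Automatic or semi-automatic summarization systems let commercial abstracting services increase the number of text documents they can process.

3. Bases for classifying text summarization

The figure for this section shows at least three aspects: (1) classification by input document, (2) classification by purpose, and (3) extraction of topic information.

3.1 By input type

  • Single document: the input is short. Many early summarization systems handled single-document summarization.
  • Multi-document: the input can be arbitrarily long.

3.2 By purpose

  • Generic: the model makes no assumptions about the domain or content of the text to be summarized and treats all input as homogeneous. Most of the work done so far revolves around generic summarization.
  • Domain-specific: the model uses domain knowledge to form a more accurate summary, for example when summarizing research papers in a particular field or biomedical literature.
  • Query-based: the summary contains only information that answers a natural-language question about the input text.

3.3 By output type

  • Extractive: important sentences are selected from the input text to form the summary. Most summarization methods today are extractive in nature.
  • Abstractive: the model forms its own phrases and sentences to give a more coherent summary, much as a human would. This approach is more appealing, but much harder than extractive summarization.

4. How to do text summarization

  • Text cleaning
  • Sentence tokenization
  • Word tokenization
  • Word-frequency table
  • Summarization

4.1 Text cleaning

    # !pip install -U spacy
    # !python -m spacy download en_core_web_sm
    import spacy
    from spacy.lang.en.stop_words import STOP_WORDS
    from string import punctuation

    stopwords = list(STOP_WORDS)
    nlp = spacy.load('en_core_web_sm')
    # `text` is the input document shown later in this article
    doc = nlp(text)

4.2 Word tokenization

    tokens = [token.text for token in doc]
    print(tokens)

    # Treat the newline character as punctuation too
    punctuation = punctuation + '\n'
    punctuation

    # Count every token that is neither a stop word nor punctuation.
    # Keys keep the token's original casing.
    word_frequencies = {}
    for word in doc:
        if word.text.lower() not in stopwords:
            if word.text.lower() not in punctuation:
                if word.text not in word_frequencies.keys():
                    word_frequencies[word.text] = 1
                else:
                    word_frequencies[word.text] += 1
    print(word_frequencies)

4.3 Sentence tokenization

    max_frequency = max(word_frequencies.values())
    max_frequency

    # Normalize every count by the highest count
    for word in word_frequencies.keys():
        word_frequencies[word] = word_frequencies[word] / max_frequency
    print(word_frequencies)

    sentence_tokens = [sent for sent in doc.sents]
    print(sentence_tokens)

4.4 Building the word-frequency table

    # Score each sentence as the sum of the normalized frequencies of its words.
    # The lookup uses the lowercased token, so a word contributes only if its
    # lowercase form is a key in word_frequencies.
    sentence_scores = {}
    for sent in sentence_tokens:
        for word in sent:
            if word.text.lower() in word_frequencies.keys():
                if sent not in sentence_scores.keys():
                    sentence_scores[sent] = word_frequencies[word.text.lower()]
                else:
                    sentence_scores[sent] += word_frequencies[word.text.lower()]
    sentence_scores
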
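The frequency normalization and sentence scoring in 4.3 and 4.4 can also be sketched without spaCy, which makes the arithmetic easier to follow. This is a minimal illustration, not the article's code: the `STOP` set, the regex-based word and sentence splitting, and the function names `word_frequencies` and `score_sentences` are all assumptions made for the example. Unlike the spaCy code, it lowercases words consistently when counting and when looking them up.

```python
import re
from collections import Counter

# Hypothetical, tiny stop-word list for illustration only;
# the article uses spaCy's STOP_WORDS instead.
STOP = {"the", "a", "an", "is", "are", "to", "of", "and", "in", "it", "i"}

WORD_RE = re.compile(r"[a-z']+")


def word_frequencies(text):
    """Count lowercased non-stop words and divide each count by the max count."""
    words = [w for w in WORD_RE.findall(text.lower()) if w not in STOP]
    counts = Counter(words)
    if not counts:
        return {}
    top = max(counts.values())
    return {w: c / top for w, c in counts.items()}


def score_sentences(text):
    """Return (sentence, score) pairs; score = sum of normalized word frequencies."""
    freqs = word_frequencies(text)
    # Naive split after '.', '!' or '?' followed by whitespace (an assumption;
    # spaCy's doc.sents is far more robust).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        (s, sum(freqs.get(w, 0.0) for w in WORD_RE.findall(s.lower())))
        for s in sentences
    ]
```

For the input "Cats chase mice. Dogs chase cats. Birds sing.", the words "cats" and "chase" each occur twice and get frequency 1.0, the remaining words get 0.5, so the first two sentences score 2.5 each and the last scores 1.0.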
4.5 Summarizing the topic information

    from heapq import nlargest

    # Keep roughly 30% of the sentences
    select_length = int(len(sentence_tokens) * 0.3)
    select_length

    # Pick the highest-scoring sentences
    summary = nlargest(select_length, sentence_scores, key=sentence_scores.get)
    summary

    # Join the selected sentences into one string
    final_summary = [word.text for word in summary]
    summary = ' '.join(final_summary)

Input document:

    text = """
    Maria Sharapova has basically no friends as tennis players on the WTA Tour. The Russian player has no problems in openly speaking about it and in a recent interview she said: 'I don't really hide any feelings too much. I think everyone knows this is my job here. When I'm on the courts or when I'm on the court playing, I'm a competitor and I want to beat every single person whether they're in the locker room or across the net. So I'm not the one to strike up a conversation about the weather and know that in the next few minutes I have to go and try to win a tennis match. I'm a pretty competitive girl. I say my hellos, but I'm not sending any players flowers as well. Uhm, I'm not really friendly or close to many players. I have not a lot of friends away from the courts.' When she said she is not really close to a lot of players, is that something strategic that she is doing? Is it different on the men's tour than the women's tour? 'No, not at all. I think just because you're in the same sport doesn't mean that you have to be friends with everyone just because you're categorized, you're a tennis player, so you're going to get along with tennis players. I think every person has different interests. I have friends that have completely different jobs and interests, and I've met them in very different parts of my life. I think everyone just thinks because we're tennis players we should be the greatest of friends. But ultimately tennis is just a very small part of what we do. There are so many other things that we're interested in, that we do.'
    """

4.6 Final summary output

    I think just because you're in the same sport doesn't mean that you have to be friends with everyone just because you're categorized, you're a tennis player, so you're going to get along with tennis players. Maria Sharapova has basically no friends as tennis players on the WTA Tour. I have friends that have completely different jobs and interests, and I've met them in very different parts of my life. I think everyone just thinks because we're tennis players So I'm not the one to strike up a conversation about the weather and know that in the next few minutes I have to go and try to win a tennis match. When she said she is not really close to a lot of players, is that something strategic that she is doing?

For the complete code, see my repository.

5. Conclusion

This article gives a concise outline of the key steps that automatic text summarization requires. Creating a dataset can be heavy work and is often an overlooked part of learning data science; in practice it deserves attention. That, however, is a topic for another blog post.

阿努普·辛格
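A note on the selection step in 4.5: `nlargest` returns sentences in descending score order, which is why the summary in 4.6 does not follow the order of the original text. One possible variant, sketched here as an assumption rather than as the author's method, selects sentences by index and then sorts the indices so the summary reads in document order. The function name `select_summary` and its parameters are hypothetical.

```python
from heapq import nlargest


def select_summary(sentences, scores, ratio=0.3):
    """Pick the top `ratio` share of sentences by score, in document order.

    `sentences` and `scores` are parallel lists (hypothetical interface).
    """
    if not sentences:
        return []
    # Always keep at least one sentence, even for very short inputs
    k = max(1, int(len(sentences) * ratio))
    # Rank sentence indices by score, highest first
    top_idx = nlargest(k, range(len(sentences)), key=lambda i: scores[i])
    # Restore the original order of the selected sentences
    return [sentences[i] for i in sorted(top_idx)]
```

With spaCy, the same idea could be applied by sorting the selected sentence spans by their `start` attribute before joining them.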