Datasets provides two kinds of dataset objects: Dataset and ✨ IterableDataset ✨.
A Dataset gives fast random access to rows and is memory-mapped, so loading even a large dataset uses relatively little memory. An IterableDataset is meant for very large datasets, including ones that cannot be fully downloaded to disk or held in memory. It lets you start reading and using the data before the download has finished.
0 Loading the data

    from datasets import load_dataset

    dataset = load_dataset("rotten_tomatoes", split="train")
    dataset

Output:

    Dataset({
        features: ['text', 'label'],
        num_rows: 8530
    })

1 Dataset
1.1 Indexing

    dataset[0]

Output:

    {'text': 'the rock is destined to be the 21st century\'s new conan and that he\'s going to make a splash even greater than arnold schwarzenegger , jean-claud van damme or steven segal .',
     'label': 1}

    dataset[-1]

Output:

    {'text': 'things really get weird , though not particularly scary : the movie is all portent and no content .',
     'label': 0}

    dataset[0]["text"]

Output:

    'the rock is destined to be the 21st century\'s new conan and that he\'s going to make a splash even greater than arnold schwarzenegger , jean-claud van damme or steven segal .'
Indexing by a column name returns all the values in that column:

    dataset["text"]

1.2 Slicing
    dataset[:3]

Output:

    {'text': ['the rock is destined to be the 21st century\'s new conan and that he\'s going to make a splash even greater than arnold schwarzenegger , jean-claud van damme or steven segal .',
      'the gorgeously elaborate continuation of the lord of the rings trilogy is so huge that a column of words cannot adequately describe co-writer/director peter jackson\'s expanded vision of j . r . r . tolkien\'s middle-earth .',
      'effective but too-tepid biopic'],
     'label': [1, 1, 1]}

2 IterableDataset
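When streaming=True is set, the dataset is loaded as an IterableDataset.
An IterableDataset behaves differently from a Dataset:
Random access is not supported. Elements can only be retrieved one at a time by iterating, for example with next(iter()) or a for loop.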
    from datasets import load_dataset

    iter_dataset = load_dataset("rotten_tomatoes", split="train", streaming=True)
    iter_dataset

Output:

    IterableDataset({
        features: ['text', 'label'],
        n_shards: 1
    })

    for i in iter_dataset:
        print(i)
        break

Output:

    {'text': 'the rock is destined to be the 21st century\'s new conan and that he\'s going to make a splash even greater than arnold schwarzenegger , jean-claud van damme or steven segal .', 'label': 1}

2.1 Creating an IterableDataset from an existing Dataset
    iter_dataset2 = dataset.to_iterable_dataset()
    for i in iter_dataset2:
        print(i)
        break

Output:

    {'text': 'the rock is destined to be the 21st century\'s new conan and that he\'s going to make a splash even greater than arnold schwarzenegger , jean-claud van damme or steven segal .', 'label': 1}

2.2 Getting a specified number of examples
    list(iter_dataset2.take(3))

Output:

    [{'text': 'the rock is destined to be the 21st century\'s new conan and that he\'s going to make a splash even greater than arnold schwarzenegger , jean-claud van damme or steven segal .',
      'label': 1},
     {'text': 'the gorgeously elaborate continuation of the lord of the rings trilogy is so huge that a column of words cannot adequately describe co-writer/director peter jackson\'s expanded vision of j . r . r . tolkien\'s middle-earth .',
      'label': 1},
     {'text': 'effective but too-tepid biopic', 'label': 1}]