from torch.utils.data import DataLoader: the DataLoader class

阿新 • Published: 2018-11-10

from torch.utils.data import DataLoader
dataloader = DataLoader(sample, batch_size=5, shuffle=True, num_workers=2)  # instantiate; sample is the dataset

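The dataset parameter is the dataset to load from (a description I personally find very broad).

batch_size defaults to 1. It is the number of samples, for example images, read per batch; the docstring below describes it as "how many samples per batch to load".

shuffle defaults to False, which keeps the original order. Setting it to True reshuffles the data at every epoch.

sampler defines the strategy for drawing samples from the dataset. If a sampler is specified, shuffle must be False.

batch_sampler is like sampler, but returns a batch of indices at a time. It is mutually exclusive with batch_size, shuffle, sampler, and drop_last.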
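As a rough illustration of how sampler and batch_sampler relate, the sketch below uses SequentialSampler and BatchSampler from torch.utils.data on a toy dataset of ten samples. The toy data and variable names are assumptions made for illustration and do not come from the original post. It also shows the ValueError raised when a sampler is combined with shuffle=True.

```python
import torch
from torch.utils.data import (BatchSampler, DataLoader, SequentialSampler,
                              TensorDataset)

# Toy dataset of 10 samples (hypothetical, for illustration only).
dataset = TensorDataset(torch.arange(10))

# A sampler yields one index at a time; a batch sampler yields lists of indices.
sampler = SequentialSampler(dataset)
batch_sampler = BatchSampler(sampler, batch_size=3, drop_last=False)
index_batches = list(batch_sampler)
print(index_batches)

# batch_sampler takes the place of batch_size, shuffle, sampler and drop_last.
loader = DataLoader(dataset, batch_sampler=batch_sampler)
batches = [batch[0].tolist() for batch in loader]
print(batches)

# Passing a sampler together with shuffle=True is rejected.
try:
    DataLoader(dataset, sampler=sampler, shuffle=True)
    conflict_error = None
except ValueError as exc:
    conflict_error = str(exc)
print(conflict_error)
```

Because the sampler here is sequential, the data batches follow the index batches exactly; swapping in RandomSampler would change only the order of the indices.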

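num_workers is the number of subprocesses (not threads) used to load data. The default is 0, which means data is loaded in the main process only.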
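The docstring quoted later in this post notes that, with worker subprocesses, seeds for other libraries such as NumPy may be duplicated across workers, and suggests using torch.initial_seed() inside worker_init_fn. A minimal sketch of that idea follows; the function name seed_worker is an assumption chosen for illustration.

```python
import random

import numpy as np
import torch


def seed_worker(worker_id):
    """Hypothetical worker_init_fn: derive other seeds from the PyTorch seed."""
    # Inside a worker process, torch.initial_seed() is base_seed + worker_id.
    worker_seed = torch.initial_seed() % 2**32  # NumPy seeds must fit in 32 bits
    np.random.seed(worker_seed)
    random.seed(worker_seed)


# Intended use (not executed here):
# DataLoader(dataset, num_workers=2, worker_init_fn=seed_worker)
```

A named, module-level function is used rather than a lambda because, as the docstring warns, worker_init_fn must be picklable when the spawn start method is used.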

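timeout must be non-negative. If positive, it is the timeout for collecting a batch from the workers.

drop_last decides whether the leftover samples are kept as a final, smaller batch. For example, with 13 images and batch_size=4, 13 = 3 × 4 + 1. With drop_last=False there are 4 batches in total (the last one holds a single image); with drop_last=True there are 3.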
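The arithmetic in the 13-image example can be checked with a small pure-Python sketch. The helper names num_batches and split_indices are hypothetical and only imitate the counting rule; they are not part of PyTorch.

```python
def num_batches(num_samples, batch_size, drop_last):
    """Number of batches for a dataset of num_samples items."""
    if drop_last:
        return num_samples // batch_size                  # discard the remainder
    return (num_samples + batch_size - 1) // batch_size   # round up


def split_indices(num_samples, batch_size, drop_last):
    """Group indices 0..num_samples-1 into consecutive batches."""
    batches = [list(range(start, min(start + batch_size, num_samples)))
               for start in range(0, num_samples, batch_size)]
    if drop_last and batches and len(batches[-1]) < batch_size:
        batches.pop()                                     # drop the incomplete batch
    return batches


kept = split_indices(13, 4, drop_last=False)
dropped = split_indices(13, 4, drop_last=True)
print(len(kept), kept[-1])   # 4 batches; the last holds one index
print(len(dropped))          # 3 batches
```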

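The other parameters are not explained here.

Calling len() on a DataLoader invokes its __len__ magic method and returns the total number of batches.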
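To make the difference between the dataset length and the loader length concrete, here is a short sketch using a dummy TensorDataset of 13 samples. The dataset contents are an assumption for illustration only.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 13 dummy samples with 3 features each (hypothetical data).
dataset = TensorDataset(torch.zeros(13, 3))

loader_keep = DataLoader(dataset, batch_size=4)                  # drop_last=False
loader_drop = DataLoader(dataset, batch_size=4, drop_last=True)

# len(dataset) counts samples; len(loader) counts batches.
print(len(dataset), len(loader_keep), len(loader_drop))

# Iterating yields exactly len(loader) batches.
yielded = sum(1 for _ in loader_keep)
print(yielded)
```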

# Excerpt from torch/utils/data/dataloader.py; default_collate, the sampler
# classes and _DataLoaderIter are defined or imported in that module.
class DataLoader(object):
    r"""
    Data loader. Combines a dataset and a sampler, and provides
    single- or multi-process iterators over the dataset.

    Arguments:
        dataset (Dataset): dataset from which to load the data.
        batch_size (int, optional): how many samples per batch to load
            (default: 1).
        shuffle (bool, optional): set to ``True`` to have the data reshuffled
            at every epoch (default: False).
        sampler (Sampler, optional): defines the strategy to draw samples from
            the dataset. If specified, ``shuffle`` must be False.
        batch_sampler (Sampler, optional): like sampler, but returns a batch of
            indices at a time. Mutually exclusive with batch_size, shuffle,
            sampler, and drop_last.
        num_workers (int, optional): how many subprocesses to use for data
            loading. 0 means that the data will be loaded in the main process.
            (default: 0)
        collate_fn (callable, optional): merges a list of samples to form a mini-batch.
        pin_memory (bool, optional): If ``True``, the data loader will copy tensors
            into CUDA pinned memory before returning them.
        drop_last (bool, optional): set to ``True`` to drop the last incomplete batch,
            if the dataset size is not divisible by the batch size. If ``False`` and
            the size of dataset is not divisible by the batch size, then the last batch
            will be smaller. (default: False)
        timeout (numeric, optional): if positive, the timeout value for collecting a batch
            from workers. Should always be non-negative. (default: 0)
        worker_init_fn (callable, optional): If not None, this will be called on each
            worker subprocess with the worker id (an int in ``[0, num_workers - 1]``) as
            input, after seeding and before data loading. (default: None)

    .. note:: By default, each worker will have its PyTorch seed set to
              ``base_seed + worker_id``, where ``base_seed`` is a long generated
              by main process using its RNG. However, seeds for other libraries
              may be duplicated upon initializing workers (e.g., NumPy), causing
              each worker to return identical random numbers. (See
              :ref:`dataloader-workers-random-seed` section in FAQ.) You may
              use ``torch.initial_seed()`` to access the PyTorch seed for each
              worker in :attr:`worker_init_fn`, and use it to set other seeds
              before data loading.

    .. warning:: If ``spawn`` start method is used, :attr:`worker_init_fn` cannot be an
                 unpicklable object, e.g., a lambda function.
    """

    __initialized = False

    def __init__(self, dataset, batch_size=1, shuffle=False, sampler=None, batch_sampler=None,
                 num_workers=0, collate_fn=default_collate, pin_memory=False, drop_last=False,
                 timeout=0, worker_init_fn=None):
        self.dataset = dataset
        self.batch_size = batch_size
        self.num_workers = num_workers
        self.collate_fn = collate_fn
        self.pin_memory = pin_memory
        self.drop_last = drop_last
        self.timeout = timeout
        self.worker_init_fn = worker_init_fn

        if timeout < 0:
            raise ValueError('timeout option should be non-negative')

        if batch_sampler is not None:
            if batch_size > 1 or shuffle or sampler is not None or drop_last:
                raise ValueError('batch_sampler option is mutually exclusive '
                                 'with batch_size, shuffle, sampler, and '
                                 'drop_last')
            self.batch_size = None
            self.drop_last = None

        if sampler is not None and shuffle:
            raise ValueError('sampler option is mutually exclusive with '
                             'shuffle')

        if self.num_workers < 0:
            raise ValueError('num_workers option cannot be negative; '
                             'use num_workers=0 to disable multiprocessing.')

        if batch_sampler is None:
            if sampler is None:
                if shuffle:
                    sampler = RandomSampler(dataset)
                else:
                    sampler = SequentialSampler(dataset)
            batch_sampler = BatchSampler(sampler, batch_size, drop_last)

        self.sampler = sampler
        self.batch_sampler = batch_sampler
        self.__initialized = True

    def __setattr__(self, attr, val):
        if self.__initialized and attr in ('batch_size', 'sampler', 'drop_last'):
            raise ValueError('{} attribute should not be set after {} is '
                             'initialized'.format(attr, self.__class__.__name__))

        super(DataLoader, self).__setattr__(attr, val)

    def __iter__(self):
        return _DataLoaderIter(self)

    def __len__(self):
        return len(self.batch_sampler)
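
The __setattr__ guard in the class above freezes batch_size, sampler and drop_last once __init__ has finished. The following standalone sketch imitates that pattern with a hypothetical class, FrozenAfterInit, so the behaviour can be tried without PyTorch; it is an illustration of the idea, not the library's code.

```python
class FrozenAfterInit(object):
    """Hypothetical class imitating DataLoader's __setattr__ guard."""

    # Name-mangled to _FrozenAfterInit__initialized; False until __init__ ends.
    __initialized = False

    def __init__(self, batch_size):
        self.batch_size = batch_size   # allowed: the guard is not active yet
        self.__initialized = True      # from here on, protected names are frozen

    def __setattr__(self, attr, val):
        if self.__initialized and attr in ('batch_size', 'sampler', 'drop_last'):
            raise ValueError('{} attribute should not be set after {} is '
                             'initialized'.format(attr, self.__class__.__name__))
        super(FrozenAfterInit, self).__setattr__(attr, val)


obj = FrozenAfterInit(batch_size=4)
obj.note = 'unprotected attributes can still be set'
try:
    obj.batch_size = 8
    error = None
except ValueError as exc:
    error = str(exc)
print(error)
```

The guard exists because the batch sampler is built once in __init__ from these values, so changing them afterwards would silently have no effect on the batches produced.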
