Description
When I trained the model on a single GPU with samples_per_gpu=2, the following error was reported:
```
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/user/anaconda3/envs/uniad2.0/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/user/anaconda3/envs/uniad2.0/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 54, in fetch
    return self.collate_fn(data)
  File "/home/user/anaconda3/envs/uniad2.0/lib/python3.9/site-packages/mmcv/parallel/collate.py", line 79, in collate
    return {
  File "/home/user/anaconda3/envs/uniad2.0/lib/python3.9/site-packages/mmcv/parallel/collate.py", line 80, in <dictcomp>
    key: collate([d[key] for d in batch], samples_per_gpu)
  File "/home/user/anaconda3/envs/uniad2.0/lib/python3.9/site-packages/mmcv/parallel/collate.py", line 84, in collate
    return default_collate(batch)
  File "/home/user/anaconda3/envs/uniad2.0/lib/python3.9/site-packages/torch/utils/data/_utils/collate.py", line 265, in default_collate
    return collate(batch, collate_fn_map=default_collate_fn_map)
  File "/home/user/anaconda3/envs/uniad2.0/lib/python3.9/site-packages/torch/utils/data/_utils/collate.py", line 119, in collate
    return collate_fn_map[elem_type](batch, collate_fn_map=collate_fn_map)
  File "/home/user/anaconda3/envs/uniad2.0/lib/python3.9/site-packages/torch/utils/data/_utils/collate.py", line 162, in collate_tensor_fn
    return torch.stack(batch, 0, out=out)
RuntimeError: stack expects each tensor to be equal size, but got [23] at entry 0 and [40] at entry 1
```
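For context, the failure happens inside default_collate: with samples_per_gpu=2, it ends up calling torch.stack on two per-sample tensors of different lengths (23 vs. 40), which cannot work. A minimal sketch that reproduces the same error outside the UniAD pipeline (the sizes and the "labels" key are only illustrative):

```python
import torch
from torch.utils.data.dataloader import default_collate

# Two toy samples whose tensors differ in length, mimicking the [23] vs. [40]
# mismatch from the traceback ("labels" is just an illustrative key).
sample_a = {"labels": torch.zeros(23)}
sample_b = {"labels": torch.zeros(40)}

try:
    # default_collate calls torch.stack on the two tensors, which requires
    # equal sizes, so this raises the same RuntimeError as above.
    default_collate([sample_a, sample_b])
except RuntimeError as err:
    print(err)
```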
Does samples_per_gpu have to be equal to 1 when training on a single GPU? Is there any solution?
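If padding the variable-length field to the longest sample in the batch were acceptable, I imagine a collate override roughly like the sketch below could avoid the size mismatch, but I am not sure whether UniAD's pipeline allows padding here (gt_labels is only a placeholder name, not necessarily the field that actually differs):

```python
import torch
from torch.utils.data.dataloader import default_collate

def pad_and_collate(batch, key="gt_labels", pad_value=0):
    # Hypothetical workaround sketch: pad the variable-length tensor stored
    # under `key` to the longest sample in the batch, then fall back to the
    # default stacking behaviour. `gt_labels` is a placeholder key.
    max_len = max(sample[key].shape[0] for sample in batch)
    for sample in batch:
        t = sample[key]
        if t.shape[0] < max_len:
            pad = t.new_full((max_len - t.shape[0], *t.shape[1:]), pad_value)
            sample[key] = torch.cat([t, pad], dim=0)
    return default_collate(batch)
```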