
nn.DataParallel has an issue on Mac (MPS device) #2601

Open

@rnb007

This is the error I get when I use the function below:

```python
def try_all_gpus():  #@save
    """Return all available GPUs, or [cpu(),] if no GPU exists."""
    devices = [torch.device(f'cuda:{i}')
               for i in range(torch.cuda.device_count())]
    return devices if devices else [torch.device('cpu')]
```

```
trainer = torch.optim.Adam(net.parameters(), lr=lr)
loss = nn.CrossEntropyLoss(reduction="none")
----> d2l.train_ch13(net, train_iter, test_iter, loss, trainer, num_epochs)

File ~/anaconda3/envs/dl_env/lib/python3.10/site-packages/d2l/torch.py:1507, in train_ch13(net, train_iter, test_iter, loss, trainer, num_epochs, devices)
   1504 timer, num_batches = d2l.Timer(), len(train_iter)
   1505 animator = d2l.Animator(xlabel='epoch', xlim=[1, num_epochs], ylim=[0, 1],
   1506                         legend=['train loss', 'train acc', 'test acc'])
-> 1507 net = nn.DataParallel(net, device_ids=devices).to(devices[0])
   1508 for epoch in range(num_epochs):
   1509     # Sum of training loss, sum of training accuracy, no. of examples,
   1510     # no. of predictions
   1511     metric = d2l.Accumulator(4)

IndexError: list index out of range
```

This happens because Macs do not have CUDA support, so `torch.cuda.device_count()` returns 0 and `try_all_gpus()` falls back to `[torch.device('cpu')]`.
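For comparison, here is a minimal sketch of a device helper that also checks Apple's MPS backend via `torch.backends.mps.is_available()` (PyTorch 1.12+); the name `try_all_devices` is my own, not part of d2l:

```python
import torch

def try_all_devices():  # hypothetical helper, not part of d2l
    """Return all CUDA devices, else [mps], else [cpu]."""
    if torch.cuda.is_available():
        return [torch.device(f'cuda:{i}')
                for i in range(torch.cuda.device_count())]
    if torch.backends.mps.is_available():  # Apple Metal backend
        return [torch.device('mps')]
    return [torch.device('cpu')]
```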

If I tweak the function above by just changing `cpu` to `mps`, the kernel always dies:
```python
def try_all_gpus():  #@save
    """Return all available GPUs, or [cpu(),] if no GPU exists."""
    devices = [torch.device(f'cuda:{i}')
               for i in range(torch.cuda.device_count())]
    return devices if devices else [torch.device('mps')]
```

How can I run `nn.DataParallel`, or for that matter the `d2l.train_ch13` function from Section 16.2 of Chapter 16?
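For context, this is the kind of single-device loop I am hoping to be able to run instead; a minimal sketch without `nn.DataParallel`, assuming `net`, `train_iter`, `loss` (with `reduction="none"`), `trainer`, and `num_epochs` are defined as above:

```python
# Minimal single-device training sketch (no nn.DataParallel).
# Assumes net, train_iter, loss, trainer, num_epochs exist as above.
import torch

device = (torch.device('mps') if torch.backends.mps.is_available()
          else torch.device('cpu'))
net = net.to(device)  # move the whole model to one device

for epoch in range(num_epochs):
    net.train()
    for X, y in train_iter:
        X, y = X.to(device), y.to(device)
        trainer.zero_grad()
        l = loss(net(X), y)   # per-example losses (reduction="none")
        l.sum().backward()
        trainer.step()
```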
