python - Unable to load resnet18 model from .pth file - Stack Overflow

admin, 2025-05-02

I trained a ResNet18 model and saved it to a .pth file. When I try to load it, I get the error below; it continues for a few more lines in the same pattern.

Error loading checkpoint: Error(s) in loading state_dict for ResNet:
size mismatch for layer1.0.conv1.weight: copying a param with shape torch.Size([64, 64, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for layer1.1.conv1.weight: copying a param with shape torch.Size([64, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([64, 64, 3, 3]).
size mismatch for layer2.0.conv1.weight: copying a param with shape torch.Size([128, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 64, 3, 3]).
size mismatch for layer2.0.downsample.0.weight: copying a param with shape torch.Size([512, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 64, 1, 1]).
size mismatch for layer2.0.downsample.1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([128]).
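
A note on reading these shapes: in torchvision's ResNet, 1×1 `conv1` kernels and 256-channel inputs inside `layer1` are characteristic of the `Bottleneck` block (ResNet-50/101/152), whereas ResNet-18/34 use `BasicBlock` with 3×3 `conv1` kernels. So the checkpoint may have been saved from a deeper ResNet than the one being constructed. The sketch below is one way to check. The helper names `guess_block_type` and `blocks_per_layer` are hypothetical, and it assumes the file holds a plain torchvision-style `state_dict`.

```python
import re
from collections import defaultdict

def guess_block_type(state_dict):
    """Guess the residual block type from torchvision-style parameter names."""
    # Bottleneck blocks have three convs per block; BasicBlock has only two.
    if "layer1.0.conv3.weight" in state_dict:
        return "bottleneck"   # ResNet-50/101/152 family
    if "layer1.0.conv1.weight" in state_dict:
        return "basic"        # ResNet-18/34 family
    return "unknown"

def blocks_per_layer(state_dict):
    """Count residual blocks per stage, e.g. [2, 2, 2, 2] for ResNet-18."""
    blocks = defaultdict(set)
    for name in state_dict:
        match = re.match(r"layer(\d+)\.(\d+)\.", name)
        if match:
            blocks[int(match.group(1))].add(int(match.group(2)))
    return [len(blocks[stage]) for stage in sorted(blocks)]

# Usage with the real file (requires torch):
#   checkpoint = torch.load("model968acc.pth", map_location="cpu")
#   print(guess_block_type(checkpoint), blocks_per_layer(checkpoint))
```

For torchvision models, `bottleneck` with `[3, 4, 6, 3]` blocks corresponds to `resnet50`, `[3, 4, 23, 3]` to `resnet101`, and `[3, 8, 36, 3]` to `resnet152`.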

This is my code for training the original model:

    import torch
    import torch.nn as nn
    import torch.optim as optim
    from torchvision import models

    # Start from ImageNet-pretrained weights and replace the head with 5 outputs
    teacher = models.resnet18(pretrained=True)

    num_features = teacher.fc.in_features
    teacher.fc = nn.Linear(num_features, 5)

    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(teacher.parameters(), lr=0.0001, momentum=0.9, weight_decay=0.0001)

    scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)

    def train_and_evaluate(model, train_loader, val_loader, criterion, optimizer, num_epochs, lambda_l1, learning_rate):
        for epoch in range(num_epochs):
            model.train()
            for images, labels in train_loader:
                # Filter out class 2 samples during training
                mask = labels != 2
                images, labels = images[mask], labels[mask]

                # Skip batches that contained only class 2 samples
                if len(labels) == 0:
                    continue

                optimizer.zero_grad()
                outputs = model(images)
                loss = criterion(outputs, labels)
                total_loss = loss
                total_loss.backward()
                optimizer.step()

            # Evaluate on both training and validation sets, excluding class 2
            # (evaluate_model is defined elsewhere and not shown here)
            train_loss, train_accuracy = evaluate_model(model, train_loader, criterion)
            val_loss, val_accuracy = evaluate_model(model, val_loader, criterion)

            print(f"Epoch {epoch+1} - Training Loss: {train_loss:.4f}, Training Accuracy: {train_accuracy:.4%}, Validation Loss: {val_loss:.4f}, Validation Accuracy: {val_accuracy:.4%}")

I know the logic for excluding class 2 is a bit messy, but I really need to recover this model because training it took a long time.

Also this is how I'm loading the model:

    checkpoint = torch.load("model968acc.pth", map_location="cpu")
    teacher.load_state_dict(checkpoint, strict=False)
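
Worth noting: `strict=False` only relaxes missing and unexpected keys. A shape mismatch on a key that exists in both the checkpoint and the model still raises. Below is a minimal sketch for listing every mismatch before attempting the load, assuming `checkpoint` is a plain `state_dict`; the helper name `shape_mismatches` is hypothetical.

```python
def shape_mismatches(checkpoint, model_state):
    """Return (name, checkpoint_shape, model_shape) for every key present in
    both mappings whose tensor shapes differ."""
    mismatches = []
    for name, param in checkpoint.items():
        if name in model_state:
            ckpt_shape = tuple(param.shape)
            model_shape = tuple(model_state[name].shape)
            if ckpt_shape != model_shape:
                mismatches.append((name, ckpt_shape, model_shape))
    return mismatches

# Usage with the objects from the question (requires torch):
#   for name, got, want in shape_mismatches(checkpoint, teacher.state_dict()):
#       print(f"{name}: checkpoint {got} vs model {want}")
```

If nearly every convolution shows up in the list, the two networks are probably different architectures rather than the same one with a different head.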

If this checkpoint cannot be recovered, how do you recommend saving the model in the future so that I don't have trouble loading it?
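
One common way to avoid this kind of ambiguity is to store the architecture name and head size next to the weights, so the loader can rebuild the same network before calling `load_state_dict`. A hedged sketch follows. The helper names, the `arch` / `num_classes` / `state_dict` keys, and the file name `teacher.pth` are illustrative choices rather than a PyTorch convention, and `weights=None` assumes torchvision 0.13 or newer.

```python
def make_checkpoint(model, arch, num_classes):
    """Bundle the weights with what is needed to rebuild the model."""
    return {
        "arch": arch,                # torchvision constructor name, e.g. "resnet18"
        "num_classes": num_classes,  # output size of the replaced fc layer
        "state_dict": model.state_dict(),
    }

def save_checkpoint(model, arch, num_classes, path):
    import torch  # imported lazily so make_checkpoint has no hard dependency
    torch.save(make_checkpoint(model, arch, num_classes), path)

def load_checkpoint(path):
    import torch
    from torch import nn
    from torchvision import models

    ckpt = torch.load(path, map_location="cpu")
    # Rebuild the same architecture without downloading pretrained weights
    model = getattr(models, ckpt["arch"])(weights=None)
    model.fc = nn.Linear(model.fc.in_features, ckpt["num_classes"])
    # Default strict=True, so any key or shape mismatch fails loudly
    model.load_state_dict(ckpt["state_dict"])
    return model

# Usage (hypothetical file name):
#   save_checkpoint(teacher, "resnet18", 5, "teacher.pth")
#   teacher = load_checkpoint("teacher.pth")
```

Saving only tensors plus a few plain Python values keeps the file loadable without the original training script, which is the main reason to prefer this over pickling the whole model object.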
