Discussion:
[PATCH] mptfusion: enable no_write_same in scsi_host_template
Chris J Arges
2014-09-22 17:56:59 UTC
Permalink
When using a virtual SCSI disk in a VMWare VM if blkdev_issue_zeroout is used
data can be improperly zeroed out using the mptfusion driver. This patch
disables write_same for this driver which ensures that manual zeroing out
is used instead.

BugLink: http://bugs.launchpad.net/bugs/1371591
Reported-by: Bruce Lucas <***@mongodb.com>
Signed-off-by: Chris J Arges <***@canonical.com>
---
drivers/message/fusion/mptspi.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/message/fusion/mptspi.c b/drivers/message/fusion/mptspi.c
index 787933d..2fcc9bd 100644
--- a/drivers/message/fusion/mptspi.c
+++ b/drivers/message/fusion/mptspi.c
@@ -850,6 +850,7 @@ static struct scsi_host_template mptspi_driver_template = {
.cmd_per_lun = 7,
.use_clustering = ENABLE_CLUSTERING,
.shost_attrs = mptscsih_host_attrs,
+ .no_write_same = 1,
};

static int mptspi_write_spi_device_pg1(struct scsi_target *starget,
--
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Christoph Hellwig
2014-09-22 18:02:20 UTC
Permalink
Post by Chris J Arges
When using a virtual SCSI disk in a VMWare VM if blkdev_issue_zeroout is used
data can be improperly zeroed out using the mptfusion driver. This patch
disables write_same for this driver which ensures that manual zeroing out
is used instead.
Does this affet real hardware or is it a VMware bug? If it's just the
latter we should simply blacklist VMware.
Chris J Arges
2014-09-22 18:17:01 UTC
Permalink
Post by Christoph Hellwig
Post by Chris J Arges
When using a virtual SCSI disk in a VMWare VM if blkdev_issue_zeroout is used
data can be improperly zeroed out using the mptfusion driver. This patch
disables write_same for this driver which ensures that manual zeroing out
is used instead.
Does this affet real hardware or is it a VMware bug? If it's just the
latter we should simply blacklist VMware.
I've only been able to reproduce this on VMWare. There is a pretty
straightforward reproducer in the BugLink if there is any interest in
verifying on hardware.

How would you recommending blacklisting only VMWare guests in this case?

Thanks,
--chris j arges
Christoph Hellwig
2014-09-22 18:19:29 UTC
Permalink
Post by Chris J Arges
I've only been able to reproduce this on VMWare. There is a pretty
straightforward reproducer in the BugLink if there is any interest in
verifying on hardware.
How would you recommending blacklisting only VMWare guests in this case?
Can you check what PCI subdevice and subvendor IDs the device you can
reproduce it with have? If the subvendor is Vmware that would be easy,
if not the Avago people might be able to help with a device specific
VMware identification. If that fails we have kernel helpers to
identify the hypervisor, but I'd rather avoid that as it would also
trigger for PCI pass through devices.
Chris J Arges
2014-09-22 18:50:08 UTC
Permalink
Post by Christoph Hellwig
Post by Chris J Arges
I've only been able to reproduce this on VMWare. There is a pretty
straightforward reproducer in the BugLink if there is any interest in
verifying on hardware.
How would you recommending blacklisting only VMWare guests in this case?
Can you check what PCI subdevice and subvendor IDs the device you can
reproduce it with have? If the subvendor is Vmware that would be easy,
if not the Avago people might be able to help with a device specific
VMware identification. If that fails we have kernel helpers to
identify the hypervisor, but I'd rather avoid that as it would also
trigger for PCI pass through devices.
Christoph,

Thanks for the input, and now I realize how broad the earlier patch was.
I'll see if I can quirk on the vendor ID to enable no_write_same, in the
meantime I'll see if there are any ways to fix the underlying issue
without adding such a quirk.

Thanks,
--chris j arges
Chris J Arges
2014-09-22 20:44:52 UTC
Permalink
When using a virtual SCSI disk in a VMWare VM if blkdev_issue_zeroout is used
data can be improperly zeroed out using the mptfusion driver. This patch
disables write_same for this driver and the vmware subsystem_vendor which
ensures that manual zeroing out is used instead.

BugLink: http://bugs.launchpad.net/bugs/1371591
Reported-by: Bruce Lucas <***@mongodb.com>
Tested-by: Chris J Arges <***@canonical.com>
Signed-off-by: Chris J Arges <***@canonical.com>
---
drivers/message/fusion/mptspi.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/message/fusion/mptspi.c b/drivers/message/fusion/mptspi.c
index 787933d..fca1594 100644
--- a/drivers/message/fusion/mptspi.c
+++ b/drivers/message/fusion/mptspi.c
@@ -1409,6 +1409,12 @@ mptspi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
return 0;
}

+ /* Fix for vmware guests that do not implement write_same
+ */
+ if (pdev->subsystem_vendor == 0x15AD) {
+ mptspi_driver_template.no_write_same = 1;
+ }
+
sh = scsi_host_alloc(&mptspi_driver_template, sizeof(MPT_SCSI_HOST));

if (!sh) {
--
1.9.1
Chris J Arges
2014-09-22 20:54:51 UTC
Permalink
Post by Chris J Arges
When using a virtual SCSI disk in a VMWare VM if blkdev_issue_zeroout is used
data can be improperly zeroed out using the mptfusion driver. This patch
disables write_same for this driver and the vmware subsystem_vendor which
ensures that manual zeroing out is used instead.
BugLink: http://bugs.launchpad.net/bugs/1371591
---
drivers/message/fusion/mptspi.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/message/fusion/mptspi.c b/drivers/message/fusion/mptspi.c
index 787933d..fca1594 100644
--- a/drivers/message/fusion/mptspi.c
+++ b/drivers/message/fusion/mptspi.c
@@ -1409,6 +1409,12 @@ mptspi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
return 0;
}
+ /* Fix for vmware guests that do not implement write_same
+ */
+ if (pdev->subsystem_vendor == 0x15AD) {
+ mptspi_driver_template.no_write_same = 1;
+ }
+
sh = scsi_host_alloc(&mptspi_driver_template, sizeof(MPT_SCSI_HOST));
if (!sh) {
This is a do-no-harm patch, with a case to check for vmware
subsystem_vendor as suggested by Christoph. Let me know if there are
issues with this patch (do I need a define for the vendor ID? Is the
comment a little too vague or unnecessary?)

Adding those that were CC'ed in the other part of the thread.

Thanks,
--chris j arges
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Christoph Hellwig
2014-09-23 07:11:54 UTC
Permalink
Hi Chris,

thanks for updating it, althugh it would need a few more updates.
Post by Chris J Arges
+ /* Fix for vmware guests that do not implement write_same
+ */
+ if (pdev->subsystem_vendor == 0x15AD) {
+ mptspi_driver_template.no_write_same = 1;
+ }
+
We should set it only on th host that matches, not the whole template
for this case. The host is allocated just below your statement in
the same function, so this should be easy.

Also no need for braces here, and try to follow the Linux comment style:

/* The VMWare emulation doesn't properly impement WRITE SAME */
if (pdev->subsystem_vendor == 0x15AD)
sh->no_write_same = 1;


(and yes, it would be good to have PCI_VENDOR_ID_VMWARE in pci_ids.h,
but that shouldn't be done in this bug fix patch)
Chris J Arges
2014-09-23 14:22:25 UTC
Permalink
When using a virtual SCSI disk in a VMWare VM if blkdev_issue_zeroout is used
data can be improperly zeroed out using the mptfusion driver. This patch
disables write_same for this driver and the vmware subsystem_vendor which
ensures that manual zeroing out is used instead.

BugLink: http://bugs.launchpad.net/bugs/1371591
Reported-by: Bruce Lucas <***@mongodb.com>
Tested-by: Chris J Arges <***@canonical.com>
Signed-off-by: Chris J Arges <***@canonical.com>
---
drivers/message/fusion/mptspi.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/message/fusion/mptspi.c b/drivers/message/fusion/mptspi.c
index 787933d..613231c 100644
--- a/drivers/message/fusion/mptspi.c
+++ b/drivers/message/fusion/mptspi.c
@@ -1419,6 +1419,11 @@ mptspi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
goto out_mptspi_probe;
}

+ /* VMWare emulation doesn't properly implement WRITE_SAME
+ */
+ if (pdev->subsystem_vendor == 0x15AD)
+ sh->no_write_same = 1;
+
spin_lock_irqsave(&ioc->FreeQlock, flags);

/* Attach the SCSI Host to the IOC structure
--
1.9.1
Chris J Arges
2014-09-23 15:29:30 UTC
Permalink
Post by Chris J Arges
When using a virtual SCSI disk in a VMWare VM if blkdev_issue_zeroout is used
data can be improperly zeroed out using the mptfusion driver. This patch
disables write_same for this driver and the vmware subsystem_vendor which
ensures that manual zeroing out is used instead.
Please hold off on applying this, I'm pursuing a different approach to
solving this bug that may be more optimal than adding this quirk.
Thanks,
--chris j arges
Post by Chris J Arges
BugLink: http://bugs.launchpad.net/bugs/1371591
---
drivers/message/fusion/mptspi.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/message/fusion/mptspi.c b/drivers/message/fusion/mptspi.c
index 787933d..613231c 100644
--- a/drivers/message/fusion/mptspi.c
+++ b/drivers/message/fusion/mptspi.c
@@ -1419,6 +1419,11 @@ mptspi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
goto out_mptspi_probe;
}
+ /* VMWare emulation doesn't properly implement WRITE_SAME
+ */
+ if (pdev->subsystem_vendor == 0x15AD)
+ sh->no_write_same = 1;
+
spin_lock_irqsave(&ioc->FreeQlock, flags);
/* Attach the SCSI Host to the IOC structure
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Chris J Arges
2014-09-23 22:11:44 UTC
Permalink
Post by Chris J Arges
When using a virtual SCSI disk in a VMWare VM if blkdev_issue_zeroout is used
data can be improperly zeroed out using the mptfusion driver. This patch
disables write_same for this driver and the vmware subsystem_vendor which
ensures that manual zeroing out is used instead.
BugLink: http://bugs.launchpad.net/bugs/1371591
---
drivers/message/fusion/mptspi.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/message/fusion/mptspi.c b/drivers/message/fusion/mptspi.c
index 787933d..613231c 100644
--- a/drivers/message/fusion/mptspi.c
+++ b/drivers/message/fusion/mptspi.c
@@ -1419,6 +1419,11 @@ mptspi_probe(struct pci_dev *pdev, const struct pci_device_id *id)
goto out_mptspi_probe;
}
+ /* VMWare emulation doesn't properly implement WRITE_SAME
+ */
+ if (pdev->subsystem_vendor == 0x15AD)
+ sh->no_write_same = 1;
+
spin_lock_irqsave(&ioc->FreeQlock, flags);
/* Attach the SCSI Host to the IOC structure
As a workaround one can do the following:

# Set the scsi disk max_write_same_blocks to 0 to disable write_same.
(Your paths may vary...)
echo 0 >
/sys/devices/pci0000:00/0000:00:10.0/host32/target32:0:0/32:0:0:0/scsi_disk/32:0:0:0/max_write_same_blocks

# Force the dm device to reload (thus calling dm_table_set_restrictions
and checking for the new max_write_same_blocks value)
dmsetup table /dev/dm-0 save
dmsetup suspend /dev/dm-0; dmsetup reload /dev/dm-0 save; dmsetup resume
/dev/dm-0

# Now the dm device shows write_same_max_bytes as 0
cat /sys/block/dm-0/queue/write_same_max_bytes

# Run the test case in the original bug, it now passes.

So a few questions:
1) Does this workaround make sense? Perhaps there is an easier way?
2) Do we expect changing max_write_same_blocks at the scsi_disk level to
propagate the right write_same flags to other layers such as dm?
3) In light of this workaround, does this patch still make sense? Would
there be a better layer to fix this?

Thanks,
--chris j arges
Martin K. Petersen
2014-09-23 23:28:04 UTC
Permalink
Chris> 1) Does this workaround make sense? Perhaps there is an easier
Chris> way?

One option is to ship a udev rule that disables write same on VMware
disks. However, I don't have a fundamental problem having a workaround
for this in the kernel.

Chris> 2) Do we expect changing max_write_same_blocks at the scsi_disk
Chris> level to propagate the right write_same flags to other layers
Chris> such as dm?

No, there's currently no way to communicate that the underlying topology
has changed.
--
Martin K. Petersen Oracle Linux Engineering
Christoph Hellwig
2014-09-24 08:18:37 UTC
Permalink
Post by Martin K. Petersen
Chris> 1) Does this workaround make sense? Perhaps there is an easier
Chris> way?
One option is to ship a udev rule that disables write same on VMware
disks. However, I don't have a fundamental problem having a workaround
for this in the kernel.
This sounds way to scary to me.

I'd like to add Chris latests patch and Cc it to stable.

Martin, given that you're ok with it can you give me a review?
Martin K. Petersen
2014-09-24 15:11:42 UTC
Permalink
Christoph> I'd like to add Chris latests patch and Cc it to stable.

Christoph> Martin, given that you're ok with it can you give me a

Reviewed-by: Martin K. Petersen <***@oracle.com>
--
Martin K. Petersen Oracle Linux Engineering
Christoph Hellwig
2014-09-25 17:01:47 UTC
Permalink
Thanks, applied to drivers-for-3.18 with Cc to stable.

Loading...