Does plt.scatter work with masked offsets?
There are 2 things that I want to check mainly:
- Does
scatter
work fine with masked arrays as input? Does
scatter
work if I reset the coordinates usingset_offsets
with masked arrays?%load_ext autoreload %autoreload 3
import numpy as np import matplotlib.pyplot as plt
1. Masked arrays as input
This test uses numpy masked arrays as direct inputs to scatter
.
x = np.ma.array([1, 2, 3, 4, 5], mask=[0, 0, 1, 1, 0]) y = np.ma.array([1, 2, 3, 4, 5]) plt.scatter(x, y)
<matplotlib.collections.PathCollection at 0x7f94ada19f90>
As in the figure, the mask on x
does work fine. So scatter
does indeed support masked arrays.
2. Masked arrays in set_offsets
Now, instead if masked arrays were passed in set_offsets
to change the offsets for scatter
, I suspect that it won’t work. To check this, we first plot using the masked data as above.
x = np.ma.array([1, 2, 3, 4, 5], mask=[0, 0, 1, 1, 0]) y = np.arange(1, 6) fig, ax = plt.subplots() scat = ax.scatter(x, y)
Now, if we update the x
data and plot it again, it should still show only the 3 points it showed before.
x += 1 print(f"Updated x: {x}") scat.set_offsets(np.ma.column_stack([x, y])) print(scat.get_offsets())
Updated x: [2 3 -- -- 6] [[2. 1.] [3. 2.] [3. 3.] [4. 4.] [6. 5.]]
As we see, the new offsets set by set_offsets
doesn’t have the mask information from the original input and will plot all the points instead of plotting only 3 points.
3. Bug fix
This bug would have a simple fix. In /lib/matplotlib/collections.py the following patch should add support for masked array in offsets.
@@ -545,9 +545,9 @@ class Collection(artist.Artist, cm.ScalarMappable): offsets = np.asanyarray(offsets) if offsets.shape == (2,): # Broadcast (2,) -> (1, 2) but nothing else. offsets = offsets[None, :] - self._offsets = np.column_stack( - (np.asarray(self.convert_xunits(offsets[:, 0]), float), - np.asarray(self.convert_yunits(offsets[:, 1]), float))) + self._offsets = np.ma.column_stack( + (np.asanyarray(self.convert_xunits(offsets[:, 0]), float), + np.asanyarray(self.convert_yunits(offsets[:, 1]), float))) self.stale = True
3.1. Test for bug fix
A simple test for this would be to check if the newly set offsets array is masked or not.
def test_masked_set_offsets(): x = np.ma.array([1, 2, 3, 4, 5], mask=[0, 0, 1, 1, 0]) y = np.arange(1, 6) fig, ax = plt.subplots() scat = ax.scatter(x, y) x += 1 scat.set_offsets(np.ma.column_stack([x, y])) assert np.ma.is_masked(scat.get_offsets())