python - How to get only the first occurrence of each increasing value in numpy array? - Stack Overflow

admin2025-05-01  1

While working on first-passage probabilities, I encountered this problem. I want to find a NumPythonic way (without explicit loops) to leave only the first occurrence of strictly increasing values in each row of a numpy array, while replacing repeated or non-increasing values with zeros. For instance, if

arr = np.array([
    [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5],
    [1, 1, 2, 2, 2, 3, 2, 2, 3, 3, 3, 4, 4],
    [3, 2, 1, 2, 1, 1, 2, 3, 4, 5, 4, 3, 2]])

I would like to get as output:

out = np.array([
    [1, 0, 0, 2, 0, 0, 3, 0, 0, 4, 0, 5, 0],
    [1, 0, 2, 0, 0, 3, 0, 0, 0, 0, 0, 4, 0],
    [3, 0, 0, 0, 0, 0, 0, 0, 4, 5, 0, 0, 0]])

Asked Jan 2 at 18:37 by alpelito7 (edited Jan 2 at 19:15)

2 Answers

The running maximum can be accumulated per row:

>>> arr
array([[1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5],
       [1, 1, 2, 2, 2, 3, 2, 2, 3, 3, 3, 4, 4],
       [3, 2, 1, 2, 1, 1, 2, 3, 4, 5, 4, 3, 2]])
>>> np.maximum.accumulate(arr, axis=1)
array([[1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5],
       [1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4],
       [3, 3, 3, 3, 3, 3, 3, 3, 4, 5, 5, 5, 5]])

Then you can easily mask out non-increasing values:

>>> m_arr = np.maximum.accumulate(arr, axis=1)
>>> np.where(np.diff(m_arr, axis=1, prepend=0), arr, 0)
array([[1, 0, 0, 2, 0, 0, 3, 0, 0, 4, 0, 5, 0],
       [1, 0, 2, 0, 0, 3, 0, 0, 0, 0, 0, 4, 0],
       [3, 0, 0, 0, 0, 0, 0, 0, 4, 5, 0, 0, 0]])
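
The two steps above can be wrapped in a single function. This is only a minimal sketch: the name `keep_new_maxima` is made up for illustration, and the input is assumed to be a 2-D numeric array. Note that `prepend=0` compares the first column against 0; a leading 0 is then masked, but it is written as 0 either way, so the result is unchanged.

```python
import numpy as np

def keep_new_maxima(arr):
    """Keep values that raise the row-wise running maximum; zero out the rest.

    Hypothetical helper name; assumes a 2-D numeric array.
    """
    arr = np.asarray(arr)
    # Running maximum along each row
    running_max = np.maximum.accumulate(arr, axis=1)
    # A nonzero step in the running maximum marks a new record value.
    # prepend=0 makes the first column compare against 0.
    is_new_max = np.diff(running_max, axis=1, prepend=0) != 0
    return np.where(is_new_max, arr, 0)

arr = np.array([
    [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 5, 5],
    [1, 1, 2, 2, 2, 3, 2, 2, 3, 3, 3, 4, 4],
    [3, 2, 1, 2, 1, 1, 2, 3, 4, 5, 4, 3, 2]])

out = keep_new_maxima(arr)
print(out)
```

Because the running maximum never decreases after the first column, every later difference is either zero or positive, so testing `!= 0` is enough to find the new maxima.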

Here's one approach:

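# True for the first column and wherever the running maximum increases
m = np.hstack(
    (np.ones((arr.shape[0], 1), dtype=bool),
     np.diff(np.fmax.accumulate(arr, axis=1)) >= 1)
)

# Start from zeros and copy over only the masked values
out = np.zeros_like(arr)
out[m] = arr[m]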

Output:

array([[1, 0, 0, 2, 0, 0, 3, 0, 0, 4, 0, 5, 0],
       [1, 0, 2, 0, 0, 3, 0, 0, 0, 0, 0, 4, 0],
       [3, 0, 0, 0, 0, 0, 0, 0, 4, 5, 0, 0, 0]])

Explanation

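  • Use np.fmax with np.ufunc.accumulate (i.e. np.fmax.accumulate) to get the running maximum of each row.
  • Check where np.diff of the running maximum is greater than or equal to 1, i.e. where the maximum increases. For integer data this is the same as testing > 0.
  • Use np.hstack to prepend a column of True values (via np.ones): np.diff returns one column fewer than arr, and the first column is always kept.
  • Finally, create an array of zeros with the same shape as arr (via np.zeros_like) and copy over the values selected by the mask.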
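
The steps above can be sketched as a reusable function. The name `first_occurrences` is hypothetical, and one detail is changed on purpose: the comparison uses `> 0` rather than `>= 1`. For integer input the two are equivalent, but with float input a rise smaller than 1 would be missed by `>= 1`. Treat this as an illustration under those assumptions, not as the answer's exact code.

```python
import numpy as np

def first_occurrences(arr):
    """Mask-based variant; hypothetical helper, assumes a 2-D numeric array."""
    arr = np.asarray(arr)
    # Running maximum of each row
    running_max = np.fmax.accumulate(arr, axis=1)
    # Assumption: "> 0" instead of ">= 1" so that float steps below 1 count too
    increased = np.diff(running_max, axis=1) > 0
    # np.diff drops one column, so prepend a True column for the first value
    first_col = np.ones((arr.shape[0], 1), dtype=bool)
    mask = np.hstack((first_col, increased))
    # Start from zeros and copy over only the masked values
    out = np.zeros_like(arr)
    out[mask] = arr[mask]
    return out

ints = np.array([[1, 1, 2, 2, 2, 3, 2, 2, 3, 3, 3, 4, 4]])
floats = np.array([[0.5, 0.5, 0.75, 0.6, 1.25]])

int_out = first_occurrences(ints)
float_out = first_occurrences(floats)
print(int_out)
print(float_out)
```

With the float row, the running maximum rises by 0.25 and then by 0.5, so both 0.75 and 1.25 are kept here, whereas a `>= 1` test would zero them out.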