python - Why keep NumPy RuntimeWarning - Stack Overflow


Here is some sample data. Even though it contains no negative values or np.nan, the code still produces a warning:

Data:

   gvkey  sale  ebit
4   1000  44.8  16.8
5   1000  53.2  11.5
6   1000  42.9   6.2
7   1000  42.4   0.9
8   1000  44.2   5.3
9   1000  51.9   9.7

Function:

def calculate_ln_values(df):
    conditions_ebit = [
        df['ebit'] >= 0.0,
        df['ebit'] <  0.0
    ]
    choices_ebit = [
        np.log(1 + df['ebit']),
        np.log(1 - df['ebit']) * -1
    ]
    df['lnebit'] = np.select(conditions_ebit, choices_ebit, default=np.nan)
    
    conditions_sale = [
        df['sale'] >= 0.0,
        df['sale'] <  0.0
    ]
    choices_sale = [
        np.log(1 + df['sale']),
        np.log(1 - df['sale']) * -1
    ]
    df['lnsale'] = np.select(conditions_sale, choices_sale, default=np.nan)
    return df

Run

calculate_ln_values(data)

Warning:

C:\Users\quoc\anaconda3\envs\uhart\Lib\site-packages\pandas\core\arraylike.py:399: RuntimeWarning: invalid value encountered in log
  result = getattr(ufunc, method)(*inputs, **kwargs)
C:\Users\quoc\anaconda3\envs\uhart\Lib\site-packages\pandas\core\arraylike.py:399: RuntimeWarning: invalid value encountered in log
  result = getattr(ufunc, method)(*inputs, **kwargs)

I would very much appreciate it if someone could help me with this issue.

---- Edit: reply to the answers of @Emi OB and @Quang Hoang: ---------------

The formula in the paper is:

ln(1+EBIT) if EBIT ≥ 0

-ln(1-EBIT) if EBIT < 0

so my code:

np.log(1 + df['ebit']),
np.log(1 - df['ebit']) * -1

follows the paper.

The argument of np.log(1 - df['ebit']) cannot be negative, since that branch only applies under the condition ebit < 0.

asked Jan 2 at 12:43, edited Jan 2 at 16:08, by PTQuoc
  • Array values are calculated in full before being passed to np.select. Same goes for a np.where call. – hpaulj Commented Jan 2 at 21:03
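
One way to act on this comment is to evaluate each formula only on the rows where it applies, using boolean masks. The sketch below assumes plain NumPy arrays; the helper name piecewise_log and the negative test values are made up for illustration and are not from the question.

```python
import warnings
import numpy as np

def piecewise_log(values):
    """ln(1 + x) where x >= 0, -ln(1 - x) where x < 0, NaN where x is NaN."""
    x = np.asarray(values, dtype=float)
    out = np.full(x.shape, np.nan)      # NaN inputs match neither mask and stay NaN
    pos = x >= 0
    neg = x < 0
    out[pos] = np.log(1 + x[pos])       # argument is always >= 1 on these rows
    out[neg] = -np.log(1 - x[neg])      # argument is always > 1 on these rows
    return out

# First three values come from the question; the last two are hypothetical.
ebit = np.array([16.8, 11.5, 0.9, -2.0, -0.5])

with warnings.catch_warnings():
    warnings.simplefilter("error")      # any RuntimeWarning would raise here
    lnebit = piecewise_log(ebit)

print(lnebit)
```

With a DataFrame, something like df['lnebit'] = piecewise_log(df['ebit']) should work the same way, since the helper converts its input to an array first.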

2 Answers

The problem is in this block of code:

    choices_ebit = [
        np.log(1 + df['ebit']),
        np.log(1 - df['ebit']) * -1
    ]

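Here, you are calculating both formulas for every row, the one for non-negative ebit and the one for negative ebit, and storing the results in choices_ebit before np.select picks between them. As a result, whenever ebit > 1 the second formula takes the log of a negative number and triggers the runtime warning, and whenever ebit < -1 the first one does. (At exactly ebit = 1 or ebit = -1 the argument is 0, which produces a divide-by-zero warning instead.)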
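
A small sketch of this behaviour, assuming plain NumPy arrays built from the ebit column of the question's sample data (the variable names are illustrative). It records the warnings to show that they come from the branch np.select never uses:

```python
import warnings
import numpy as np

ebit = np.array([16.8, 11.5, 6.2, 0.9, 5.3, 9.7])   # ebit values from the question

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    positive_branch = np.log(1 + ebit)     # valid for every row
    negative_branch = -np.log(1 - ebit)    # 1 - ebit < 0 on five rows: NaN plus a warning
    result = np.select(
        [ebit >= 0, ebit < 0],
        [positive_branch, negative_branch],
        default=np.nan,
    )

messages = [str(w.message) for w in caught]
print(messages)                    # warnings raised while building the choices
print(np.isnan(negative_branch))   # NaN wherever ebit > 1
print(np.isnan(result).any())      # the selected result itself contains no NaN
```

The final column is correct; the warning only reflects the NaN values computed for the discarded branch.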

To avoid calculating both formulas, you can combine them into a single expression using abs() and np.sign():

    df['lnebit'] = np.log(1 + df['ebit'].abs()) * np.sign(df['ebit'])

This meets your requirements:

  • when ebit>0, sign(ebit) == 1 and abs(ebit) == ebit, so that resolves to log(1+ebit)
  • when ebit<0, sign(ebit) == -1 and abs(ebit) == -ebit, so that resolves to -log(1-ebit)
  • when ebit==0, sign(ebit) == 0, so the result is 0, which equals log(1+0)
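
As a quick check of the equivalence above, here is a minimal sketch that compares the sign/abs expression with the piecewise formula evaluated one element at a time. It assumes NumPy arrays rather than a DataFrame column, swaps in np.log1p(x) for np.log(1 + x), and uses hypothetical helper names (signed_log1p, paper_formula) and made-up negative values.

```python
import math
import warnings
import numpy as np

def signed_log1p(values):
    """sign(x) * ln(1 + |x|), computed in one pass with no invalid arguments."""
    x = np.asarray(values, dtype=float)
    return np.sign(x) * np.log1p(np.abs(x))

def paper_formula(v):
    """Scalar reference: ln(1 + v) if v >= 0, else -ln(1 - v)."""
    return math.log(1 + v) if v >= 0 else -math.log(1 - v)

# 16.8 and 0.9 are from the question; 0.0, -0.5 and -3.0 are hypothetical.
ebit = np.array([16.8, 0.9, 0.0, -0.5, -3.0])

with warnings.catch_warnings():
    warnings.simplefilter("error")   # any RuntimeWarning would raise here
    transformed = signed_log1p(ebit)

reference = np.array([paper_formula(v) for v in ebit])
print(np.allclose(transformed, reference))
```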

You are getting this warning because you are passing negative values into np.log() when you evaluate the following:

np.log(1 - df['ebit']) * -1

and

np.log(1 - df['sale']) * -1

I imagine the * -1 part was your attempt to avoid passing in a negative value; however, you are applying it outside of the log function, hence the warning. For example, if 1 - df['ebit'] = n, your code first computes log(n) and then multiplies the result by -1. If n is negative (as it is for most rows of your data), log(n) is undefined, so NumPy returns NaN and emits the warning.

You want to rewrite your log calls so that the * -1 is inside the log, like:

np.log((1 - df['sale']) * -1)

Edit thanks to @Quang Hoang

Using:

np.log((1 - df['sale']).abs())

is a more robust way of achieving what you're after, because using * -1 still passes a negative value to np.log() whenever a value in df['sale'] is less than 1. Using .abs() takes the absolute value of each element in the column, that is, its value regardless of sign, so no negative values are passed into np.log().
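
To see how the .abs() variant behaves on the sample data, the sketch below compares it with the ln(1 + x) branch of the formula quoted in the question. It assumes plain NumPy arrays built from the sale column; the variable names are illustrative.

```python
import warnings
import numpy as np

sale = np.array([44.8, 53.2, 42.9, 42.4, 44.2, 51.9])   # sale values from the question

with warnings.catch_warnings():
    warnings.simplefilter("error")            # any RuntimeWarning would raise here
    abs_variant = np.log(np.abs(1 - sale))    # equals ln(sale - 1) when sale > 1
    paper_branch = np.log(1 + sale)           # ln(1 + sale), the branch for sale >= 0

print(abs_variant)
print(paper_branch)
print(np.allclose(abs_variant, paper_branch))
```

On this data the .abs() variant runs without a warning, but it computes ln(sale - 1) rather than ln(1 + sale), so its values differ from the formula in the question's edit.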
