Calculation of outlier score in series_outlier method #147
Lopa2016
announced in
Announcements
Replies: 0 comments
I want to implement Kusto's series_outliers function in Python and used the following code:
import pandas as pd
import numpy as np
from scipy.stats import norm

# Load the data into a DataFrame
data = {
    'series': [67.95675, 58.63898, 33.59188, 4906.018, 5.372538, 702.1194, 0.037261, 11161.05, 1.403496, 100.116]
}
df = pd.DataFrame(data)

# Function to calculate the outlier score based on custom percentiles
def custom_percentile_outliers(series, p_low=10, p_high=90):
    # Calculate custom percentiles
    percentile_low = np.percentile(series, p_low)
    percentile_high = np.percentile(series, p_high)
    # z-scores of the chosen percentiles under a standard normal distribution
    z_low = norm.ppf(p_low / 100)
    z_high = norm.ppf(p_high / 100)
    # Calculate normalization factor
    normalization_factor = (2 * z_high - z_low) / (2 * z_high - 2.704)
    # Calculate outliers score
    return series.apply(lambda x: (x - percentile_high) / (percentile_high - percentile_low) * normalization_factor
                        if x > percentile_high else ((x - percentile_low) / (percentile_high - percentile_low) * normalization_factor
                        if x < percentile_low else 0))

# Apply the custom percentile outlier scoring function
df['outliers'] = custom_percentile_outliers(df['series'], p_low=10, p_high=90)

# Display the DataFrame with outliers
print(df)
I get the following results for the series:
         series   outliers
0     67.956750   0.000000
1     58.638980   0.000000
2     33.591880   0.000000
3   4906.018000   0.000000
4      5.372538   0.000000
5    702.119400   0.000000
6      0.037261   0.006067
7  11161.050000 -27.776847
8      1.403496   0.000000
9    100.116000   0.000000
With the series_outliers function itself, however, I get different results for the same series (image).
I referred to #136 on GitHub and also tried implementing and manually calculating the scores with the help of the Stack Overflow answer to "How does Kusto series_outliers() calculate anomaly scores?"
I am probably going wrong in the normalization factor calculation. It would be great if someone could help.
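To narrow down where the normalization goes wrong, it may help to look at the terms of the factor on their own, since they depend only on the percentiles and not on the data. The sketch below is a diagnostic of the formula in my code, not a reproduction of Kusto's implementation. The helper name `normalization_terms` and its `fence` parameter are hypothetical, and it uses `statistics.NormalDist().inv_cdf` from the standard library as a stand-in for `scipy.stats.norm.ppf`.

```python
from statistics import NormalDist


def normalization_terms(p_low=10, p_high=90, fence=2.704):
    """Return the pieces of the normalization factor used in the code above.

    `fence` is the constant 2.704 from that formula; the name is hypothetical.
    """
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    z_low = std_normal.inv_cdf(p_low / 100)    # same role as norm.ppf(p_low / 100)
    z_high = std_normal.inv_cdf(p_high / 100)  # same role as norm.ppf(p_high / 100)
    numerator = 2 * z_high - z_low
    denominator = 2 * z_high - fence
    return {
        "z_low": z_low,
        "z_high": z_high,
        "numerator": numerator,
        "denominator": denominator,
        "factor": numerator / denominator,
    }


terms = normalization_terms()
for name, value in terms.items():
    print(f"{name:>12}: {value:.6f}")
```

With the 10th and 90th percentiles, 2 * z_high is about 2.563, which is smaller than 2.704, so the denominator and therefore the whole factor come out negative. That is consistent with the table above, where the value above the 90th percentile gets a negative score and the value below the 10th percentile gets a positive one, so the sign and magnitude of this factor look like the place to check against the referenced answer.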