Introduction to Computational Analysis




Notebook Creator: Roy Hyunjin Han

Anomaly detection

Jack is building an early-warning system for unusual server activity. He is using the following variables gathered from server access logs:

  • Location of IP address
  • Day of the week
  • Time of day
  • Number of unique URLs accessed by IP address in the last minute
  • Average number of accesses per URL
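The variables above have to become numbers before a model can use them. The notebook does not show how `make_logs` encodes them, so the following is only a minimal sketch under stated assumptions: the `REGIONS` list, the `encode_log_entry` helper, and the sine/cosine treatment of day and time are all hypothetical choices, not the notebook's own method.

```python
import math

# Hypothetical region list; the notebook does not say how location is represented.
REGIONS = ["NA", "EU", "AS", "OTHER"]

def encode_log_entry(region, weekday, hour, unique_urls, mean_hits_per_url):
    """Turn one access-log summary into a flat list of floats.

    weekday: 0 (Monday) to 6 (Sunday); hour: 0 <= hour < 24.
    """
    # One-hot encode the region; unknown regions fall into "OTHER".
    if region not in REGIONS:
        region = "OTHER"
    one_hot = [1.0 if region == r else 0.0 for r in REGIONS]
    # Map cyclic quantities onto a circle so that Sunday sits next to Monday
    # and 23:00 sits next to 00:00.
    day_angle = 2 * math.pi * weekday / 7
    hour_angle = 2 * math.pi * hour / 24
    cyclic = [math.sin(day_angle), math.cos(day_angle),
              math.sin(hour_angle), math.cos(hour_angle)]
    return one_hot + cyclic + [float(unique_urls), float(mean_hits_per_url)]

features = encode_log_entry("EU", weekday=2, hour=14.5,
                            unique_urls=3, mean_hits_per_url=1.7)
print(features)
```

Because the RBF kernel used later depends on distances between rows, the count features would likely need rescaling to a range comparable to the other columns before fitting.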
In [3]:
from scripts import make_logs

# Generate synthetic access logs; the first 1000 rows are treated as inliers, the rest as outliers
logs = make_logs(inlierCount=1000, outlierCount=10)
inliers = logs.data[:1000]
outliers = logs.data[1000:]
In [4]:
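from sklearn import svm

# nu is an upper bound on the fraction of training points treated as errors;
# gamma sets the width of the RBF kernel
model = svm.OneClassSVM(nu=0.1, kernel='rbf', gamma=0.1)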
In [5]:
model.fit(inliers)
In [6]:
# predict expects a 2D array, so slice out one row; +1 means inlier, -1 means outlier
model.predict(inliers[:1])
In [7]:
# predict expects a 2D array, so slice out one row; +1 means inlier, -1 means outlier
model.predict(outliers[:1])
In [8]:
predictions = model.predict(inliers)
# Count the inliers that the model wrongly flags as outliers (-1)
errors = predictions[predictions == -1].size
print('Error rate: %s %%' % (100 * errors / float(len(predictions))))
In [9]:
model.predict(outliers)
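The cells above depend on the local `scripts.make_logs` helper, which is not shown here. To try the same workflow without it, the sketch below swaps in synthetic Gaussian data; the five-column layout and the cluster positions are assumptions for illustration and do not reflect what `make_logs` actually produces. It fits the same `OneClassSVM` settings, then measures how many inliers are wrongly flagged and how many outliers are caught.

```python
import numpy as np
from sklearn import svm

rng = np.random.RandomState(0)
# Stand-in data (assumption): 1000 normal rows near the origin,
# 10 anomalous rows centered far away from them.
inliers = rng.normal(loc=0.0, scale=1.0, size=(1000, 5))
outliers = rng.normal(loc=10.0, scale=1.0, size=(10, 5))

# Same settings as the notebook; fit on normal traffic only.
model = svm.OneClassSVM(nu=0.1, kernel='rbf', gamma=0.1)
model.fit(inliers)

# predict returns +1 for inliers and -1 for outliers.
inlier_predictions = model.predict(inliers)
outlier_predictions = model.predict(outliers)

false_alarm_rate = np.mean(inlier_predictions == -1)
detection_rate = np.mean(outlier_predictions == -1)
print('False alarm rate on inliers: %.1f %%' % (100 * false_alarm_rate))
print('Detection rate on outliers: %.1f %%' % (100 * detection_rate))
```

Since `nu` bounds the fraction of training errors from above, the false alarm rate on the training inliers should stay near or below 10% with `nu=0.1`. Lowering `nu` reduces false alarms but makes the boundary looser around the normal data.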