Name: app-malicious-domains
Owner: H2O.ai
Description: Domain name classifier looking for good vs. possibly malicious providers
Created: 2016-03-08 21:02:49.0
Updated: 2018-05-04 16:53:28.0
Pushed: 2018-05-04 16:53:26.0
Homepage: null
Size: 51089
Language: HTML
This example builds a machine learning application using AWS Lambda, an Amazon service that automatically manages compute resources for request-driven code. It simplifies the process of scaling microservices, eliminating the need to provision or manage servers. The front-end of the application is a web browser; the back-end is a Lambda function whose components include a function handler, Jython code for feature munging, and an H2O model POJO. The front-end and back-end communicate via a REST endpoint.
The application classifies domain names as legitimate or malicious. Malicious domains earn their label by engaging in malicious activity, such as botnets, phishing, and malware hosting. To evade security systems, attackers use domain names that are generated by algorithms. To detect domains that may be malicious, the app builds a model based on linguistic features that distinguish regular domains from algorithmically generated ones.
| Legitimate domains | Malicious domains |
|:---|:---|
| h2o | zyxgifnjobqhzptuodmzov |
| zen-cart | c3p4j7zdxexg1f2tuzk117wyzn |
| fedoraforum | batdtrbtrikw |
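As a rough illustration of what such linguistic features could look like, the sketch below computes four values for a domain name: its length, the Shannon entropy of its characters, the proportion of vowels, and the number of dictionary words it contains. The feature names mirror fields of the app's JSON response; the exact definitions used in `src/main/resources/pymodule.py` are not shown in this document, so the formulas, the helper names (`shannon_entropy`, `count_words`, `domain_features`), and the tiny vocabulary are all assumptions made for illustration.

```python
import math
from collections import Counter

VOWELS = set("aeiou")


def shannon_entropy(text):
    """Shannon entropy (in bits) of the character distribution of text."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def count_words(text, vocabulary, min_len=3):
    """Count vocabulary words (of at least min_len characters) found in text."""
    return sum(1 for word in vocabulary if len(word) >= min_len and word in text)


def domain_features(domain, vocabulary):
    """Return a dict of hypothetical linguistic features for one domain name."""
    name = domain.lower()
    vowel_count = sum(1 for ch in name if ch in VOWELS)
    return {
        "length": len(name),
        "entropy": shannon_entropy(name),
        "proVowels": vowel_count / len(name) if name else 0.0,
        "numWords": count_words(name, vocabulary),
    }


if __name__ == "__main__":
    # A stand-in for src/main/resources/words.txt (assumed to be a word list).
    vocabulary = {"fedora", "forum", "zen", "cart"}
    for domain in ("fedoraforum", "zyxgifnjobqhzptuodmzov"):
        print(domain, domain_features(domain, vocabulary))
```

Under these assumed definitions, one would expect algorithmically generated names to show higher character entropy and fewer dictionary words than names chosen by people, which is the kind of contrast a classifier can pick up.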
The “Make Data Products” presentation given at the Silicon Valley Big Data Science meetup on March 17, 2016 references this repo.
| Data | Offline | Front-end | Back-end |
|---|---|---|---|
| legit-dga_domains.csv | build.gradle | src/main/webapp/index.html | lib/h2o-genmodel.jar (downloaded) |
| src/main/resources/words.txt | h2o-model.py | src/main/webapp/app.js | lib/aws-lambda-java-core-1.0.0.jar |
| | | | lib/jython-standalone-2.7.0.jar |
| | | | src/main/java/Classify.java |
| | | | src/main/java/MaliciousDomainModel.java (generated) |
| | | | src/main/resources/pymodule.py |
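Create the Gradle wrapper:

    gradle wrapper

Install the H2O Python module from http://www.h2o.ai/download/h2o/python

Build the project:

    ./gradlew build

Run the web app locally with Jetty, skipping model generation:

    ./gradlew jettyRunWar -x generateModel
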
(If you don't include the -x generateModel above, you will build the models and deployment package again, which is time consuming.)
Confusion Matrix (Act/Pred) for max f1 @ threshold = 0.493541945983:

           0      1      Error    Rate
    -----  -----  -----  -------  ---------------
    0      15889  315    0.0194   (315.0/16204.0)
    1      346    10043  0.0333   (346.0/10389.0)
    Total  16235  10358  0.0249   (661.0/26593.0)

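The counts in the confusion matrix above can be turned into the usual classification metrics. The short sketch below does that arithmetic; it assumes rows are actual classes and columns are predicted classes (as the "Act/Pred" label suggests) and treats class 1 as the malicious, positive class. The variable names and the `rate` helper are illustrative.

```python
# Counts copied from the confusion matrix (rows = actual, columns = predicted).
tn, fp = 15889, 315    # actual class 0: predicted 0, predicted 1
fn, tp = 346, 10043    # actual class 1: predicted 0, predicted 1


def rate(numerator, denominator):
    """Safe division that returns 0.0 for an empty denominator."""
    return numerator / denominator if denominator else 0.0


precision = rate(tp, tp + fp)            # share of predicted-malicious that are malicious
recall = rate(tp, tp + fn)               # share of malicious domains that were caught
f1 = rate(2 * tp, 2 * tp + fp + fn)      # harmonic mean of precision and recall
error_rate = rate(fp + fn, tn + fp + fn + tp)

print(f"precision={precision:.4f} recall={recall:.4f} "
      f"f1={f1:.4f} error={error_rate:.4f}")
```

The per-class and overall error rates computed this way agree with the Error column of the matrix (0.0194, 0.0333, and 0.0249 after rounding).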
    curl -X POST -d "{\"domain\":\"plzdonthackmekthxbye\"}" <api_endpoint_url>

    {
      "label": 1,
      "class0Prob": 0.002564083122440164,
      "class1Prob": 0.9974359168775598,
      "intercept": -14.94132841574946,
      "length": 29.841565204329598,
      "entropy": 11.178560649883826,
      "proVowels": -1.7679609134401084,
      "numWords": -18.347249579636706
    }

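One way to read the sample response: the values for the intercept and the four features add up to about 5.96, and the logistic function of that sum reproduces the reported class 1 probability. This suggests the fields are additive terms of a logistic model, but that interpretation is inferred from the numbers in this one sample rather than stated by the source. The sketch below checks the arithmetic; the key spellings and the names `TERM_KEYS` and `probability_from_terms` are assumptions.

```python
import math

# Sample response values as they appear in the document (key spellings assumed).
response = {
    "label": 1,
    "class0Prob": 0.002564083122440164,
    "class1Prob": 0.9974359168775598,
    "intercept": -14.94132841574946,
    "length": 29.841565204329598,
    "entropy": 11.178560649883826,
    "proVowels": -1.7679609134401084,
    "numWords": -18.347249579636706,
}

# Fields treated as additive terms of the linear predictor (an assumption).
TERM_KEYS = ("intercept", "length", "entropy", "proVowels", "numWords")


def sigmoid(z):
    """Logistic function mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def probability_from_terms(resp):
    """Sum the assumed additive terms and squash the sum into a probability."""
    logit = sum(resp[key] for key in TERM_KEYS)
    return sigmoid(logit)


p1 = probability_from_terms(response)
print("recomputed class 1 probability:", p1)
print("reported class 1 probability:  ", response["class1Prob"])
```

If this reading is right, the sign of each term shows which way that feature pushed the prediction for this domain: here the length and entropy terms push toward class 1, while the vowel and word-count terms push toward class 0.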
Check if the function already exists and, if not, try again. For slower internet connections, try uploading the .zip file with an S3 link in the Code tab.
In the AWS Lambda console, click the Configuration tab. Click Advanced settings and increase the timeout field.
If the web app is unresponsive at first, this is due to Lambda's cold start. Keep submitting domain names; within a minute, the web app should become responsive.
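A client script can apply the same "keep trying" advice automatically. The sketch below posts a domain to the endpoint and retries on network errors or timeouts for roughly a minute in total. The function names (`post_domain`, `classify_with_retry`), the retry count, and the delay are hypothetical choices; the endpoint URL is whatever your deployment exposes.

```python
import json
import time
import urllib.error
import urllib.request


def post_domain(url, domain, timeout=10.0):
    """POST {"domain": ...} as JSON to url and return the decoded JSON reply."""
    body = json.dumps({"domain": domain}).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def classify_with_retry(url, domain, send=post_domain,
                        attempts=6, delay=10.0, sleep=time.sleep):
    """Call send(url, domain), retrying on I/O errors such as cold-start timeouts.

    URLError, HTTPError, and socket timeouts are all subclasses of OSError,
    so catching OSError covers them. The last error is re-raised if every
    attempt fails.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return send(url, domain)
        except OSError as error:
            last_error = error
            if attempt < attempts - 1:
                sleep(delay)  # give the Lambda container time to warm up
    raise last_error
```

Passing `send` and `sleep` as parameters keeps the retry logic testable without a live endpoint: a test can substitute a fake sender and a no-op sleep.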
Performance was tested with JMeter on a MacBook Pro with a 2.5 GHz Intel Core i7, using a wireless internet connection over the office WAN.
Before testing, a warm-up cycle of 100 loops was run. Times are in milliseconds. The body data of the POST request was `{"domain":"plzdonthackmekthxbye"}`.
| Memory (MB) | Threads | Loops | Samples | Average | Median | 90% | 95% | 99% | Min | Max | Error % | Throughput (calls/sec) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 512 | 1 | 10000 | 10000 | 113 | 102 | 118 | 138 | 426 | 85 | 2137 | 0 | 8.4 |
| 512 | 10 | 1000 | 10000 | 170 | 102 | 148 | 182 | 334 | 85 | 30330 | 0.18 | 44 |
| 512 | 100 | 100 | 10000 | 392 | 149 | 643 | 943 | 1738 | 85 | 30307 | 0.43 | 168 |
The Gradle distribution shows how to do basic WAR and Jetty plugin operations.
http://docs.aws.amazon.com/lambda/latest/dg/create-deployment-pkg-zip-java.html