unravelin/code-test-mle

Ravelin Machine Learning Test

Part 1:

The customers.jsonl file includes a list of JSON profiles representing fictional customers from an e-commerce company. create_dataset.py is a short Python script that can be run to create a larger dataset using customers.jsonl as a template.

  • Run the create_dataset.py script. An integer parameter can be used to specify the number of rows of data.
  • Use your newly generated historic dataset (NOT the customers.jsonl template) to help build part of a training dataset for a payments fraud prevention model.
  • Each row in the dataset will be used as a row of training data. We'd like to build a feature that captures, for each training row, the number of transactions from the same paymentMethodIssuer that occurred in the 24 hours before that row's transaction.
  • This will help us detect if there is an increase in the "velocity" of transactions coming from a given payment method issuer (bank).
  • You can use any tool you would like, but we'd like you to stick to SQL & Python as much as possible.
  • Write up some analysis of your implementation and how it scales as the number of rows in the dataset increases. We're more interested in a well-written discussion than in a solution that processes the largest number of rows.
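One possible way to compute the 24-hour issuer velocity feature is a per-issuer sliding window over time-sorted rows. The sketch below is only an illustration, not a reference solution. It assumes each transaction has already been flattened into a dict with a `paymentMethodIssuer` key (named in the task) and a `transactionTime` datetime key (a hypothetical name, since the generated dataset's schema is not shown here). It also assumes the current row should not count itself and that a transaction exactly 24 hours old falls outside the window.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(hours=24)


def issuer_velocity(transactions):
    """Count, for each row, earlier same-issuer transactions in the prior 24 hours.

    `transactions` is a list of dicts with the keys `paymentMethodIssuer` and
    `transactionTime` (the latter is a hypothetical field name).
    Counts are returned in the same order as the input rows.
    """
    # Visit rows in time order while remembering their original positions.
    order = sorted(
        range(len(transactions)),
        key=lambda i: transactions[i]["transactionTime"],
    )
    recent = defaultdict(deque)  # issuer -> timestamps still inside the window
    counts = [0] * len(transactions)

    for i in order:
        row = transactions[i]
        ts = row["transactionTime"]
        window = recent[row["paymentMethodIssuer"]]
        # Evict timestamps that are 24 hours old or older.
        while window and window[0] <= ts - WINDOW:
            window.popleft()
        counts[i] = len(window)  # the current row is not counted
        window.append(ts)
    return counts


rows = [
    {"paymentMethodIssuer": "A", "transactionTime": datetime(2024, 1, 1, 0, 0)},
    {"paymentMethodIssuer": "A", "transactionTime": datetime(2024, 1, 1, 12, 0)},
    {"paymentMethodIssuer": "B", "transactionTime": datetime(2024, 1, 1, 13, 0)},
    {"paymentMethodIssuer": "A", "transactionTime": datetime(2024, 1, 2, 6, 0)},
]
print(issuer_velocity(rows))  # [0, 1, 0, 1]
```

On scaling: the sort costs O(n log n), and each timestamp is appended and evicted at most once, so the pass itself is linear in the number of rows. Memory grows with the number of transactions that fall inside any single 24-hour window. A naive self-join comparing every pair of rows would instead grow quadratically.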

Part 2:

The third file, create_model.py, can be used to create a basic fraud prevention PyTorch model for inference.

  • Create a simple server for model inference that accepts a feature set and predicts whether or not a transaction is fraudulent.
  • Comment on some statistics related to your model deployment.
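As a rough illustration of the shape such a server could take, the sketch below uses only the Python standard library. Everything in it is an assumption rather than part of the task: the `score` function is a hypothetical stand-in for the PyTorch model produced by create_model.py, and the request format (`{"features": [...]}`), the 0.5 decision threshold, the port, and the latency bookkeeping are illustrative choices.

```python
import json
import math
import statistics
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

LATENCIES_MS = []  # per-request latency samples, for simple deployment statistics


def score(features):
    """Hypothetical stand-in for the model: a sigmoid of the feature sum."""
    z = sum(features)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # numerically stable branch for negative z
    return ez / (1.0 + ez)


def handle_payload(payload):
    """Validate a decoded JSON body and build the prediction response."""
    features = payload.get("features") if isinstance(payload, dict) else None
    if not isinstance(features, list) or not all(
        isinstance(x, (int, float)) for x in features
    ):
        raise ValueError("'features' must be a list of numbers")
    probability = score(features)
    return {"fraud_probability": probability, "is_fraud": probability >= 0.5}


def latency_summary():
    """Summarise the latencies recorded so far, in milliseconds."""
    if not LATENCIES_MS:
        return {"count": 0}
    return {
        "count": len(LATENCIES_MS),
        "mean_ms": statistics.mean(LATENCIES_MS),
        "max_ms": max(LATENCIES_MS),
    }


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        start = time.perf_counter()
        try:
            length = int(self.headers.get("Content-Length", 0))
            body = handle_payload(json.loads(self.rfile.read(length)))
            status = 200
        except ValueError as exc:  # also covers malformed JSON
            body, status = {"error": str(exc)}, 400
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)
        LATENCIES_MS.append((time.perf_counter() - start) * 1000)


def serve(port=8000):
    """Start the server on localhost; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()
```

Calling `serve()` starts the server, after which a POST with a JSON body such as `{"features": [0.1, -0.4]}` returns a probability and a boolean decision. In a real deployment `score` would load the trained model once at startup and run it in evaluation mode. Latency percentiles, throughput, and error rates are the kind of statistics that `latency_summary` could be extended to report.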

We're looking for...

  • Readable and scalable code
  • A clear discussion of the solution, analysis and conclusions. Talk us through the motivation behind your solution.
  • Please don't spend more than 1-2 hours on this task. You may find that there are more aspects of the model/data than you can realistically investigate and that's fine. If that is the case, please just describe what your next steps might be if there were more time allocated to the task.

Questions

Please reach out to us if you have any questions. We would be happy to clarify anything and give you guidance where we can :)
