phenix-challenge

This project provides a solution for the Carrefour Phenix Challenge. The application calculates sales volume and sales amount (turnover) by store.

Assumptions

  • The files do not contain duplicate data
  • The file structures are valid. The calculation tasks fail if any parsing error occurs
  • For each Transaction file we have corresponding Product files.

The output results are described below:

Day Calculation

It takes the Transaction file of the day and aggregates the sales volume of each product by store. The result keeps only the top 100 best-selling products per store and is saved as:

top_100_ventes_${store_uuid}_YYYYMMDD.data

The structure of the result file : productId|salesVolume
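
The aggregation described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the `Transaction` case class, its field names, and the `DayCalculationSketch` object are hypothetical, and ties are broken by product id only to make the ordering deterministic.

```scala
// Hypothetical record type: the field names are assumptions, not taken from the project.
final case class Transaction(storeUuid: String, productId: Long, quantity: Int)

object DayCalculationSketch {
  // Sums quantities per (store, product), then keeps the `limit` best sellers per store.
  def topSalesByStore(
      transactions: Iterator[Transaction],
      limit: Int = 100
  ): Map[String, Seq[(Long, Int)]] = {
    val volumes = transactions.foldLeft(Map.empty[(String, Long), Int]) { (acc, t) =>
      val key = (t.storeUuid, t.productId)
      acc.updated(key, acc.getOrElse(key, 0) + t.quantity)
    }
    volumes.toSeq
      .groupBy { case ((store, _), _) => store }
      .map { case (store, rows) =>
        val ranked = rows
          .map { case ((_, productId), volume) => (productId, volume) }
          // Highest volume first; product id breaks ties (an assumption).
          .sortBy { case (productId, volume) => (-volume, productId) }
          .take(limit)
        store -> ranked
      }
  }

  // Renders one result line in the productId|salesVolume layout.
  def toLine(row: (Long, Int)): String = s"${row._1}|${row._2}"
}
```

Each entry of the returned map would correspond to one top_100_ventes file, with one `toLine` output per row.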

Week Calculation

It takes the Transaction and Product files of the last 7 days (starting from today's date). The Transaction files are then joined to the Product files using the date in the file name (fileDate) and the storeUuid in order to calculate the turnover generated by each product in each store. The result keeps only the top 100 products by turnover per store and is saved as:

top_100_ca_${store_uuid}_YYYYMMDD-J7.data

The structure of the result file : productId|salesAmount
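
The join and turnover calculation above can be sketched as follows. The record types, field names, and the `WeekCalculationSketch` object are hypothetical. The sketch also assumes that the product id is part of the join key and that transactions without a matching price are skipped, neither of which is confirmed by the project.

```scala
import java.time.LocalDate

// Hypothetical record types: the field names are assumptions for illustration.
final case class DatedTransaction(fileDate: LocalDate, storeUuid: String, productId: Long, quantity: Int)
final case class ProductPrice(fileDate: LocalDate, storeUuid: String, productId: Long, price: BigDecimal)

object WeekCalculationSketch {
  // Joins transactions to prices on (fileDate, storeUuid, productId), sums
  // quantity * price per (store, product), and keeps the top `limit` per store.
  def topTurnoverByStore(
      transactions: Seq[DatedTransaction],
      prices: Seq[ProductPrice],
      limit: Int = 100
  ): Map[String, Seq[(Long, BigDecimal)]] = {
    val priceIndex: Map[(LocalDate, String, Long), BigDecimal] =
      prices.map(p => (p.fileDate, p.storeUuid, p.productId) -> p.price).toMap

    val turnover = transactions.foldLeft(Map.empty[(String, Long), BigDecimal]) { (acc, t) =>
      priceIndex.get((t.fileDate, t.storeUuid, t.productId)) match {
        case Some(price) =>
          val key = (t.storeUuid, t.productId)
          acc.updated(key, acc.getOrElse(key, BigDecimal(0)) + price * BigDecimal(t.quantity))
        // Assumption: a transaction with no matching price is ignored.
        case None => acc
      }
    }

    turnover.toSeq
      .groupBy { case ((store, _), _) => store }
      .map { case (store, rows) =>
        val ranked = rows
          .map { case ((_, productId), amount) => (productId, amount) }
          .sortBy { case (productId, amount) => (-amount, productId) }
          .take(limit)
        store -> ranked
      }
  }
}
```

Indexing the prices in a map keeps the join to a single pass over the transactions, at the cost of holding seven days of price data in memory.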

Prerequisites

To run this application, you must have Java 8 installed on your machine:

Java 8

Usage

Get the packaged JAR from the target folder and execute it as follows:

phenix-challenge 1.0.0
Usage: phenix-challenge [options]

  --input.data.folder <value>
                           input data folder
  --output.result.folder <value>
                           output result folder

The --input.data.folder argument must point to the folder that contains the Transaction and Product data files. The --output.result.folder argument must contain the path to a valid folder where the calculation results will be saved.

Both arguments are mandatory.

Example :

java -jar phenix-challenge-1.0.0-RC.jar --input.data.folder=/path/to/data/folder --output.result.folder=/path/to/folder

Input configuration

To retrieve and parse the input files, the application reads all the required properties from the resource config file configs.yaml. If the structure of the files changes, that file must be modified for the tasks to run. The default configuration in configs.yaml is:

- fileType: Transaction
  fileNamePattern: "(transactions_)(${file_date})(.data)"
  fileDatePattern: "yyyyMMdd"
  fileProperties:
    delimiter: "|"
    hasHeader: false
    quote: "\""
    escape: "\\"
    charset: "UTF-8"
- fileType: Product
  fileNamePattern: "(reference_prod-)(${store_uuid})(_)(${file_date})(.data)"
  fileDatePattern: "yyyyMMdd"
  fileProperties:
    delimiter: "|"
    hasHeader: false
    quote: "\""
    escape: "\\"
    charset: "UTF-8"
- fileType: SalesResult
  fileNamePattern: "top_100_${measure_type}_${aggregation_level}_${file_date}${delta}.data"
  fileDatePattern: "yyyyMMdd"
  fileProperties:
    delimiter: "|"
    hasHeader: false
    quote: "\""
    escape: "\\"
    charset: "UTF-8"
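
One way such a fileNamePattern could be resolved is sketched below. How the project actually substitutes the placeholders is not documented here, so everything in this sketch is an assumption: the `FileNamePatternSketch` object, its method names, and the regex fragments chosen for `${file_date}` and `${store_uuid}`.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import scala.util.Try
import scala.util.matching.Regex

object FileNamePatternSketch {
  // Replaces the placeholders with regex fragments. The fragments are assumptions:
  // 8 digits for a yyyyMMdd date, 36 hex digits or dashes for a UUID.
  def toRegex(fileNamePattern: String): Regex =
    fileNamePattern
      .replace("${file_date}", "\\d{8}")
      .replace("${store_uuid}", "[0-9a-fA-F-]{36}")
      .r

  private val transactionPattern: Regex = toRegex("(transactions_)(${file_date})(.data)")
  private val dateFormat: DateTimeFormatter = DateTimeFormatter.ofPattern("yyyyMMdd")

  // Returns the date embedded in a Transaction file name, or None when the
  // name does not match the pattern or the date is not a valid calendar date.
  def transactionFileDate(fileName: String): Option[LocalDate] =
    fileName match {
      case transactionPattern(_, date, _) => Try(LocalDate.parse(date, dateFormat)).toOption
      case _                              => None
    }
}
```

For example, `FileNamePatternSketch.transactionFileDate("transactions_20170514.data")` yields the date 2017-05-14, while a Product file name yields `None`.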

Installing

Clone

Clone this repo to your local machine using :

git clone https://github.com/blackbishop313/phenix-challenge

Development

Requirements :

  • JVM
  • Scala (v 2.11 or above)
  • Maven

Running the tests

Run the tests with the Maven command:

mvn test

Built With

  • Maven - Dependency Management

TODO

  • complete tests
  • improve memory usage (currently, using Scala Stream to handle large data files can cause memory problems because Stream memoizes its values).
  • improve file importing (add escaping, quotes, handle line parsing errors)
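
As an illustration of the memory item above, one possible direction is to fold over an Iterator of lines instead of a Stream, since an Iterator does not retain elements it has already produced. This is only a sketch under that assumption; `StreamingReadSketch` and `foldLines` are hypothetical names, not part of the project.

```scala
import scala.io.Source

object StreamingReadSketch {
  // Folds over the lines of a file one at a time. Unlike Stream, the Iterator
  // returned by getLines() does not memoize, so consumed lines can be collected.
  def foldLines[A](path: String, charset: String = "UTF-8")(zero: A)(step: (A, String) => A): A = {
    val source = Source.fromFile(path, charset)
    try source.getLines().foldLeft(zero)(step)
    finally source.close() // always release the file handle
  }
}
```

A call such as `StreamingReadSketch.foldLines(path)(0)((n, _) => n + 1)` counts the lines of a file while holding only the running total in memory.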

Authors

License

This project is licensed under the Apache License. See the LICENSE.md file for details.
