
Summary

The objective is to integrate an analysis mechanism that will run different tests and report a status for each of them with a meaningful comment. The mechanism provided by Opal will be analysis-independent; analysis routines will be provided by plugins. A generic analysis plugin will be based on GitHub repositories (routines will be downloaded from GitHub, and the repositories are configurable).

Rationale

Improve the data quality by reusing and sharing data quality inspection routines.

References

User Stories

Describe simply who is doing what and how to obtain the result.

# Who What How Result
1
2
...

Scope

  • Plugin
  • GWT UI
  • R server

Design and Implementation Plan

Although Opal only supports R as a data processing backend, it would be good to have an analysis plugin definition that is independent of the data processing engine. An R-specific SPI will then be designed on top of it.

The data analysis flow is as follows:

  • on startup, Opal discovers the opal-analysis plugins and makes them available in the UI (list of plugins with their associated schema forms)
  • user goes to a table page where he/she can find a new tab called "Analysis"
  • the "Analysis" tab offers:
    • to prepare an analysis by selecting a routine, setting its parameters and optionally specifying the subset of the table on which the routine is to be executed
    • to view the previous analyses with their associated reports: status, message, timestamps and a report in Markdown or PDF
    • an analysis batch (set of routines with parameters) can be saved for reuse.
  • when an analysis batch is sent:
    • a project task is created (/project/{prj}/_analyse)
    • an R session is created, and the table (or its subset, represented by a view wrapper) is then pushed to the R session as a tibble
    • the R analysis service is then triggered with the tibble symbol and the R session provided (a sketch of such an entry point follows this list)
    • each R analysis routine must return a result including: status, message, timestamps and a report in Markdown or PDF; these results are persisted (data files or OrientDB?)
    • the user can follow the progress of the analysis batch in the "Tasks" tab
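
To make this flow concrete, here is a minimal Java sketch of what the R-specific analysis entry point could look like; every name in it (RAnalysisService, RSession, the tibble symbol parameter) is an assumption for illustration, not the actual Opal API. Analysis and AnalysisResult refer to the SPI contracts sketched in the Domain section below.

```java
/**
 * Hypothetical R-specific analysis entry point (illustrative names, not Opal's API).
 * The caller is expected to have already assigned the table (or its view-wrapped subset)
 * as a tibble in the R session under the given symbol.
 */
public interface RAnalysisService {

  /**
   * Runs one analysis against the tibble assigned under tibbleSymbol in the provided
   * R session and reports its outcome (status, message, timestamps, report).
   */
  AnalysisResult analyse(RSession session, String tibbleSymbol, Analysis analysis);

  /** Handle on an R session managed by Opal; a placeholder for this sketch. */
  interface RSession {
    String getId();
  }
}
```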

Domain

Analysis SPI

The analysis SPI (and its R-specific version) will define the following (a Java sketch follows the list):

  • AnalysisTemplate describes the analysis routine (name, title, description in Markdown, JSON schema form)
  • Analysis represents an instance of an analysis request, with a name, associated template name, and parameters
  • AnalysisResult reports the results of an Analysis (timestamps, status, output message, report location). Sometimes a single analysis has multiple parameters that lead to multiple sub-results. In this case there is a global result (if one sub-result is Failed, then the global result is Failed as well) and a list of sub-results.
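
A minimal Java sketch of these contracts, assuming accessor names that are purely illustrative (the real SPI may differ):

```java
import java.util.Date;
import java.util.List;

// Illustrative SPI contracts; names and accessors are assumptions, not the final API.
public interface AnalysisTemplate {
  String getName();
  String getTitle();
  String getDescription();     // Markdown
  String getSchemaForm();      // JSON schema form used to build the parameters UI
}

interface Analysis {
  String getName();
  String getTemplateName();    // which AnalysisTemplate this request instantiates
  String getParameters();      // parameter values, e.g. a JSON document matching the schema form
}

interface AnalysisResult {
  String getAnalysisName();
  String getStatus();          // e.g. PASSED or FAILED
  String getMessage();
  Date getStartDate();
  Date getEndDate();
  String getReportPath();      // location of the Markdown or PDF report
  List<AnalysisResult> getSubResults(); // empty when the analysis produced a single result
}
```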

Analysis plugin implementations

The data analysis plugin implementations will have access to convenience classes that allow performing R operations from Opal (R scripts, data assignment, file transfers).

A generic R-based analysis plugin implementation could be provided. This implementation could download R routines from one or more GitHub repositories (the file layout would follow some naming conventions). This way the same plugin (configured in the Opal plugin management page) could propose an extensible set of analyses.
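
As an illustration of that generic plugin, the sketch below assumes a conventional repository layout (for example, one directory per routine holding an R script and a schema form); that layout and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustration only: a plugin configured with GitHub repositories that exposes
// one AnalysisTemplate per routine directory found in them.
public class GithubAnalysisPlugin {

  private final List<String> repositories; // e.g. "owner/repo" entries from the plugin configuration

  public GithubAnalysisPlugin(List<String> repositories) {
    this.repositories = repositories;
  }

  /**
   * Lists the analysis templates offered by this plugin. A real implementation would
   * download each repository and map every routine directory to an AnalysisTemplate.
   */
  public List<AnalysisTemplate> getTemplates() {
    List<AnalysisTemplate> templates = new ArrayList<>();
    for (String repository : repositories) {
      templates.addAll(downloadTemplates(repository));
    }
    return templates;
  }

  private List<AnalysisTemplate> downloadTemplates(String repository) {
    // Placeholder: fetching and parsing the repository content is out of scope for this sketch.
    return Collections.emptyList();
  }
}
```

Adding a repository (or a new routine directory in an existing one) would then make a new analysis template appear without releasing a new plugin version.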

Web Services

REST Description
GET /analysis-plugins List available analysis plugins and for each of them the analysis templates
GET /analysis-plugin/{plg} Get a specific analysis plugin DTO
POST /project/{prj}/commands/_analyse Launch an analysis task (can be multiple sub-analyses, but only on one table)
GET /project/{prj}/analyses List all the recorded analyses for the project (filtered by permission)
GET /project/{prj}/table/{tbl}/analyses List all the recorded analyses for the table
GET /project/{prj}/table/{tbl}/analysis/{id} Get a specific table recorded analysis
GET /project/{prj}/table/{tbl}/analysis/{id}/results Get a specific table recorded analysis results
GET /project/{prj}/table/{tbl}/analysis/{id}/result/{rid} Get a specific table recorded analysis result
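
For illustration, launching an analysis task over REST could look like the sketch below, using Java's built-in HTTP client; the base URL, credentials and JSON body are hypothetical placeholders, not the documented Opal payload.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;

public class AnalyseCommandExample {

  public static void main(String[] args) throws Exception {
    // Hypothetical payload: the real analyse command DTO is defined elsewhere.
    String body = "{\"table\": \"my_table\", \"analyses\": [" +
        "{\"name\": \"quality-check\", \"templateName\": \"histogram\", \"parameters\": {}}]}";

    String credentials = Base64.getEncoder().encodeToString("user:password".getBytes());

    HttpRequest request = HttpRequest.newBuilder()
        .uri(URI.create("https://opal.example.org/project/my_project/commands/_analyse"))
        .header("Authorization", "Basic " + credentials)
        .header("Content-Type", "application/json")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build();

    HttpResponse<String> response = HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString());

    // A task should now appear in the project's Tasks tab.
    System.out.println(response.statusCode());
  }
}
```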

Permissions

A user needs to have the permission to see table values in order to run analyses (much like for reports).

Analyse Task

The analysis is executed asynchronously by the task system; see how this is done for the Report feature.

Analysis Results Persistence

The analysis results are to be saved in Opal's internal database (OrientDB). Result files, if any, are to be stored separately in the folder OPAL_HOME/data/analyses/{id}/results/{rid}.
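
For illustration, resolving a result file location under that layout could be as simple as the following sketch (the class, method and variable names are assumptions):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class AnalysisResultFiles {

  /** Resolves OPAL_HOME/data/analyses/{id}/results/{rid} for a given analysis and result. */
  public static Path resultDirectory(String opalHome, String analysisId, String resultId) {
    return Paths.get(opalHome, "data", "analyses", analysisId, "results", resultId);
  }

  public static void main(String[] args) {
    // e.g. /var/lib/opal/data/analyses/12/results/3
    System.out.println(resultDirectory("/var/lib/opal", "12", "3"));
  }
}
```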

Analysis Life Cycle

  • On Analysis deletion, all associated results shall be deleted
  • On Table deletion, all associated analyses with their results shall be deleted

UI Mockups

Analyses Tab

The list is a list of AnalysisResults.

  • on Run, the analyse task is submitted and the user is informed by a notification growl,
  • on View, the analysis type and parameters are displayed (same as the New Analysis dialog but read-only) with the details of the result,
  • on Duplicate, a New Analysis dialog is opened with the same field values,
  • on Delete, the analysis result is removed (and associated documents and permissions).

New Analysis Dialog

On analysis type selection, the Parameters section is rebuilt from the corresponding schema form.
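
As an example of the kind of document driving this rebuild, a template's JSON schema form could look like the snippet below (held in a Java string for consistency with the other sketches); the field names and structure are purely hypothetical.

```java
public class ExampleSchemaForm {

  // Hypothetical JSON schema form; the New Analysis dialog's Parameters section
  // would be rebuilt from such a document when the analysis type is selected.
  static final String FORM =
      "{\n" +
      "  \"type\": \"object\",\n" +
      "  \"properties\": {\n" +
      "    \"variable\": { \"type\": \"string\", \"title\": \"Variable to check\" },\n" +
      "    \"threshold\": { \"type\": \"number\", \"title\": \"Missingness threshold\", \"default\": 0.1 }\n" +
      "  },\n" +
      "  \"required\": [\"variable\"]\n" +
      "}";

  public static void main(String[] args) {
    System.out.println(FORM);
  }
}
```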

Analysis Result dialog

Viewing an analysis result shows the analysis type and parameters along with the last execution results. When there are sub-results, a Details section shows the status of each of these tests.

Some results can have sub-result items:

Test/Demo Plan

How can the feature be tested or demonstrated? It is important to describe this in fairly great detail so anyone can perform the demo or test.

Unresolved Issues

This should highlight any issues that should be addressed in further specifications, and not problems with the specification itself, since any specification with problems cannot be approved.