
Commit 37f6fc5

Author: Xuzhou Qin (committed)

v0.4.1

Signed-off-by: Xuzhou Qin <xuzhou.qin@jcdecaux.com>

1 parent 40325ac · commit 37f6fc5

File tree: 2 files changed (+44, -34 lines)


CHANGELOG.md

Lines changed: 42 additions & 32 deletions
@@ -1,9 +1,34 @@
-## 0.4.1 (2020-01-15)
+## 0.4.1 (2020-02-13)
+Changes:
+- Changed benchmark unit of time to *seconds* (#88)
+
+Fixes:
+- The master URL of SparkSession can now be overwritten in local environment (#74)
+- `FileConnector` now lists path correctly for nested directories (#97)
+
 New features:
 - Added [Mermaid](https://mermaidjs.github.io/#/) diagram generation to **Pipeline** (#51)
-- Added `showDiagram()` method to **Pipeline** that prints the Mermaid code and generates the
-  live editor URL 🎩🐰✨ (#52)
+- Added `showDiagram()` method to **Pipeline** that prints the Mermaid code and generates the live editor URL 🎩🐰✨ (#52)
 - Added **Codecov** report and **Scala API doc**
+- Added `delete` method in `JDBCConnector` (#82)
+- Added `drop` method in `DBConnector` (#83)
+- Added support for both of the following two Spark configuration styles in SETL builder (#86)
+```hocon
+setl.config {
+  spark {
+    spark.app.name = "my_app"
+    spark.sql.shuffle.partitions = "1000"
+  }
+}
+
+setl.config_2 {
+  spark.app.name = "my_app"
+  spark.sql.shuffle.partitions = "1000"
+}
+```
+
+Others:
+- Improved test coverage
 
 ## 0.4.0 (2020-01-09)
 Changes:
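As a side note on #86: both HOCON blocks in the hunk above are meant to yield the same SparkSession settings once handed to the SETL builder. Below is a minimal sketch of how they might be consumed; `Setl.builder()`, `withDefaultConfigLoader()`, and `getOrCreate()` follow the project README, while `setSetlConfigPath` is an assumed method name, not verified against the 0.4.1 API.

```scala
import com.jcdecaux.setl.Setl

// Minimal sketch: build a Setl entry point from application.conf.
// setSetlConfigPath is an ASSUMED name standing for "use this HOCON
// path as the SETL config root"; check the 0.4.1 API before relying on it.
val setl: Setl = Setl.builder()
  .withDefaultConfigLoader()          // loads application.conf from resources
  .setSetlConfigPath("setl.config")   // the flat "setl.config_2" style should work equally
  .getOrCreate()
```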
@@ -26,46 +51,37 @@ Others:
 - Optimized **PipelineInspector** (#33)
 
 ## 0.3.5 (2019-12-16)
-- BREAKING CHANGE: replace the Spark compatible version by the Scala compatible version in the artifact ID.
-  The old artifact id **dc-spark-sdk_2.4** was changed to **dc-spark-sdk_2.11** (or **dc-spark-sdk_2.12**)
+- BREAKING CHANGE: replace the Spark compatible version by the Scala compatible version in the artifact ID. The old artifact id **dc-spark-sdk_2.4** was changed to **dc-spark-sdk_2.11** (or **dc-spark-sdk_2.12**)
 - Upgraded dependencies
 - Added Scala 2.12 support
 - Removed **SparkSession** from Connector and SparkRepository constructor (old constructors are kept but now deprecated)
 - Added **Column** type support in FindBy method of **SparkRepository** and **Condition**
-- Added method **setConnector** and **setRepository** in **Setl** that accept
-  object of type Connector/SparkRepository
+- Added methods **setConnector** and **setRepository** in **Setl** that accept objects of type Connector/SparkRepository
 
 ## 0.3.4 (2019-12-06)
 - Added read cache into spark repository to avoid consecutive disk IO.
-- Added option **autoLoad** in the Delivery annotation so that *DeliverableDispatcher* can still handle the dependency
-  injection in the case where the delivery is missing but a corresponding
-  repository is present.
+- Added option **autoLoad** in the Delivery annotation so that *DeliverableDispatcher* can still handle the dependency injection in the case where the delivery is missing but a corresponding repository is present.
 - Added option **condition** in the Delivery annotation to pre-filter loaded data when **autoLoad** is set to true.
-- Added option **id** in the Delivery annotation. DeliveryDispatcher will match deliveries by the id in addition to
-  the payload type. By default the id is an empty string ("").
-- Added **setConnector** method in DCContext. Each connector should be delivered with an ID. By default the ID will be its
-  config path.
+- Added option **id** in the Delivery annotation. DeliveryDispatcher will match deliveries by the id in addition to the payload type. By default the id is an empty string ("").
+- Added **setConnector** method in DCContext. Each connector should be delivered with an ID. By default the ID will be its config path.
 - Added support of wildcard path for SparkRepository and Connector
 - Added JDBCConnector
 
 ## 0.3.3 (2019-10-22)
 - Added **SnappyCompressor**.
-- Added method **persist(persistence: Boolean)** into **Stage** and **Factory** to.
-  activate/deactivate output persistence. By default the output persistence is set to *true*.
+- Added method **persist(persistence: Boolean)** into **Stage** and **Factory** to activate/deactivate output persistence. By default the output persistence is set to *true*.
 - Added implicit method `filter(cond: Set[Condition])` for Dataset and DataFrame.
 - Added `setUserDefinedSuffixKey` and `getUserDefinedSuffixKey` to **SparkRepository**.
 
 ## 0.3.2 (2019-10-14)
-- Added **@Compress** annotation. **SparkRepository** will compress all columns having this annotation by
-  using a **Compressor** (the default compressor is **XZCompressor**)
+- Added **@Compress** annotation. **SparkRepository** will compress all columns having this annotation by using a **Compressor** (the default compressor is **XZCompressor**)
 ```scala
 case class CompressionDemo(@Compress col1: Seq[Int],
                            @Compress(compressor = classOf[GZIPCompressor]) col2: Seq[String])
 ```
 
 - Added interface **Compressor** and implemented **XZCompressor** and **GZIPCompressor**
-- Added **SparkRepositoryAdapter[A, B]**. It will allow a **SparkRepository[A]** to write/read a data store of type
-  **B** by using an implicit **DatasetConverter[A, B]**
+- Added **SparkRepositoryAdapter[A, B]**. It will allow a **SparkRepository[A]** to write/read a data store of type **B** by using an implicit **DatasetConverter[A, B]**
 - Added trait **Converter[A, B]** that handles the conversion between an object of type A and an object of type **B**
 - Added abstract class **DatasetConverter[A, B]** that extends a **Converter[Dataset[A], Dataset[B]]**
 - Added auto-correction for `SparkRepository.findby(conditions)` method when we filter by case class field name instead of column name
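To make the 0.3.4 Delivery options concrete, here is a hedged sketch of a factory using all three. Only the option names autoLoad, condition, and id come from the changelog entries above; the payload type, class and field names, package paths, and the Factory method set are assumptions.

```scala
import com.jcdecaux.setl.annotation.Delivery
import com.jcdecaux.setl.transformation.Factory
import org.apache.spark.sql.Dataset

// Hypothetical payload type, for illustration only.
case class Order(date: String, amount: Double)

class OrderFactory extends Factory[Dataset[Order]] {

  // autoLoad = true: if no delivery matches, DeliverableDispatcher may fall
  // back to a registered repository of Dataset[Order];
  // condition: pre-filters the auto-loaded data;
  // id: matched against the id the deliverable was registered with ("" by default).
  @Delivery(autoLoad = true, condition = "date > '2020-01-01'", id = "daily_orders")
  var orders: Dataset[Order] = _

  override def read(): this.type = this
  override def process(): this.type = this
  override def write(): this.type = this
  override def get(): Dataset[Order] = orders
}
```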
@@ -77,8 +93,7 @@ case class CompressionDemo(@Compress col1: Seq[Int],
 - Added sequential mode in class `Stage`. Users can turn it on by setting `parallel` to *true*.
 - Added external data flow description in pipeline description
 - Added method `beforeAll` into `ConfigLoader`
-- Added new method `addStage` and `addFactory` that take a class object as input. The instantiation will be handled
-  by the stage.
+- Added new methods `addStage` and `addFactory` that take a class object as input. The instantiation will be handled by the stage.
 - Removed implicit argument encoder from all methods of Repository trait
 - Added new get method to **Pipeline**: `get[A](cls: Class[_ <: Factory[_]]): A`.
 
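The new `get[A](cls: Class[_ <: Factory[_]]): A` variant reads as "fetch the output of that factory". A one-line sketch, reusing the hypothetical `OrderFactory` from the earlier sketch and an assumed built `pipeline` value:

```scala
// Sketch: fetch the Dataset produced by OrderFactory from a built pipeline.
val orders: Dataset[Order] = pipeline.get[Dataset[Order]](classOf[OrderFactory])
```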
@@ -97,8 +112,7 @@ case class CompressionDemo(@Compress col1: Seq[Int],
 ```
 - Added an optional argument `suffix` in `FileConnector` and `SparkRepository`
 - Added method `partitionBy` in `FileConnector` and `SparkRepository`
-- Added possibility to filter by name pattern when a FileConnector is trying to read a directory.
-  To do this, add `filenamePattern` into the configuration file
+- Added possibility to filter by name pattern when a FileConnector is trying to read a directory. To do this, add `filenamePattern` into the configuration file
 - Added possibility to create a `Conf` object from Map.
 ```scala
 Conf(Map("a" -> "A"))
@@ -122,15 +136,12 @@ case class CompressionDemo(@Compress col1: Seq[Int],
 - Added a second argument to CompoundKey to handle primary and sort keys
 
 ## 0.2.7 (2019-06-21)
-- Added `Conf` into `SparkRepositoryBuilder` and changed all the set methods
-  of `SparkRepositoryBuilder` to use the conf object
+- Added `Conf` into `SparkRepositoryBuilder` and changed all the set methods of `SparkRepositoryBuilder` to use the conf object
 - Changed package name `com.jcdecaux.setl.annotations` to `com.jcdecaux.setl.annotation`
 
 ## 0.2.6 (2019-06-18)
-- Added annotation `ColumnName`, which could be used to replace the current column name
-  with an alias in the data storage.
-- Added annotation `CompoundKey`. It could be used to define a compound key for databases
-  that only allow one partition key
+- Added annotation `ColumnName`, which could be used to replace the current column name with an alias in the data storage.
+- Added annotation `CompoundKey`. It could be used to define a compound key for databases that only allow one partition key
 - Added sheet name into arguments of ExcelConnector
 
 ## 0.2.5 (2019-06-12)
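A sketch of the 0.2.6 annotations, combined with the second CompoundKey argument mentioned at the top of this hunk; the argument semantics (key group, then position) are inferred from these notes rather than confirmed documentation:

```scala
import com.jcdecaux.setl.annotation.{ColumnName, CompoundKey}

// Assumed semantics: first argument names the key group ("partition"/"sort"),
// second gives the field's position within that compound key.
case class CityRecord(
  @CompoundKey("partition", "1") country: String, // partition key, part 1
  @CompoundKey("sort", "1") city: String,         // sort key, part 1
  @ColumnName("pop") population: Long             // persisted as column "pop"
)
```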
@@ -155,8 +166,7 @@ that only allow one partition key
 
 ## 0.2.0 (2019-05-21)
 - Changed spark version to 2.4.3
-- Added `SparkRepositoryBuilder` that allows creation of a `SparkRepository` for a given class without creating a
-  dedicated `Repository` class
+- Added `SparkRepositoryBuilder` that allows creation of a `SparkRepository` for a given class without creating a dedicated `Repository` class
 - Added Excel support for `SparkRepository` by creating `ExcelConnector`
 - Added `Logging` trait
 
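Finally, a sketch of the 0.2.0 `SparkRepositoryBuilder`, which underpins much of the later repository work. Only the builder's purpose (a `SparkRepository` without a dedicated `Repository` class) comes from the entry; the package paths, `Storage.CSV`, and `setPath` are assumptions, and `CityRecord` reuses the hypothetical case class from the previous sketch.

```scala
import com.jcdecaux.setl.enums.Storage
import com.jcdecaux.setl.storage.SparkRepositoryBuilder

// Sketch: a repository for CityRecord backed by CSV files, with no
// hand-written Repository subclass. Setter names are assumptions.
val repo = new SparkRepositoryBuilder[CityRecord](Storage.CSV)
  .setPath("data/cities")
  .getOrCreate()
```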

README.md

Lines changed: 2 additions & 2 deletions
@@ -25,7 +25,7 @@ You can start working by cloning [this template project](https://github.com/qxzz
 <dependency>
   <groupId>com.jcdecaux.setl</groupId>
   <artifactId>setl_2.11</artifactId>
-  <version>0.4.0</version>
+  <version>0.4.1</version>
 </dependency>
 ```

@@ -42,7 +42,7 @@ To use the SNAPSHOT version, add Sonatype snapshot repository to your `pom.xml`
 <dependency>
   <groupId>com.jcdecaux.setl</groupId>
   <artifactId>setl_2.11</artifactId>
-  <version>0.4.1-SNAPSHOT</version>
+  <version>0.4.2-SNAPSHOT</version>
 </dependency>
 </dependencies>
 ```
