Commit 59ac21c

Russell Hay authored
Merge pull request #66 from tableau/development
Releasing 0.2 to master
2 parents e6a0bba + aa93eef commit 59ac21c

32 files changed: +1407 -90 lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -63,3 +63,4 @@ target/

 #Other things
 .DS_Store
+.idea

.travis.yml

Lines changed: 3 additions & 2 deletions
@@ -12,10 +12,11 @@ install:
 # command to run tests
 script:
   # Tests
-  - python test.py
+  - python setup.py test
   # pep8
-  - pep8 --ignore=E501 .
+  - pep8 .
   # Examples
   - (cd "Examples/Replicate Workbook" && python replicateWorkbook.py)
   - (cd "Examples/List TDS Info" && python listTDSInfo.py)
+  - (cd "Examples/GetFields" && python show_fields.py)

CHANGELOG.md

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+## 0.2 (22 July 2016)
+
+* Added support for loading twbx and tdsx files (#43, #44)
+* Added Fields property to datasource (#45)
+* Added Example for using the Fields Property (#51)
+* Added Ability to get fields used by a specific sheet (#54)
+* Code clean up and test reorganization
+
+## 0.1 (29 June 2016)
+
+* Initial Release to the world
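
The twbx/tdsx support listed under 0.2 can be pictured with a small sketch. It rests on one assumption stated plainly: packaged Tableau files are zip archives that contain the plain `.twb` or `.tds` XML alongside other assets. The helper name `parse_tableau_file` is hypothetical and is not the library's own `xfile` implementation, which this diff does not show.

```python
import os
import tempfile
import zipfile
import xml.etree.ElementTree as ET


def parse_tableau_file(path, extensions=('.twb', '.tds')):
    """Hypothetical helper: parse a plain or packaged Tableau file into an ElementTree."""
    if zipfile.is_zipfile(path):
        # Assumption: the archive holds exactly one workbook/datasource XML file.
        with zipfile.ZipFile(path) as archive:
            inner = next(n for n in archive.namelist() if n.endswith(extensions))
            with archive.open(inner) as handle:
                return ET.parse(handle)
    # Not a zip archive, so treat it as plain XML.
    return ET.parse(path)


# Build a tiny stand-in .tdsx so the sketch runs without real Tableau files.
workdir = tempfile.mkdtemp()
packaged = os.path.join(workdir, 'World.tdsx')
with zipfile.ZipFile(packaged, 'w') as archive:
    archive.writestr('World.tds', "<datasource formatted-name='World' />")
    archive.writestr('Data/extract.bin', b'not xml')

print(parse_tableau_file(packaged).getroot().get('formatted-name'))
```

Checking `zipfile.is_zipfile` first lets one entry point serve both plain and packaged files, which matches how `Datasource.from_file` is called with either kind of file name.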

Examples/GetFields/World.tds

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+../List TDS Info/World.tds

Examples/GetFields/show_fields.py

Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+############################################################
+# Step 1) Use Datasource object from the Document API
+############################################################
+from tableaudocumentapi import Datasource
+
+############################################################
+# Step 2) Open the .tds we want to inspect
+############################################################
+sourceTDS = Datasource.from_file('World.tds')
+
+############################################################
+# Step 3) Print out all of the fields and what type they are
+############################################################
+print('----------------------------------------------------------')
+print('--- {} total fields in this datasource'.format(len(sourceTDS.fields)))
+print('----------------------------------------------------------')
+for count, field in enumerate(sourceTDS.fields.values()):
+    print('{:>4}: {} is a {}'.format(count+1, field.name, field.datatype))
+    blank_line = False
+    if field.calculation:
+        print(' the formula is {}'.format(field.calculation))
+        blank_line = True
+    if field.default_aggregation:
+        print(' the default aggregation is {}'.format(field.default_aggregation))
+        blank_line = True
+
+    if blank_line:
+        print('')
+print('----------------------------------------------------------')

README.md

Lines changed: 37 additions & 15 deletions
@@ -6,15 +6,26 @@ This repo contains Python source and example files for the Tableau Document API.

 Document API
 ---------------
-The Document API provides a supported way to programmatically make updates to Tableau workbook (`.twb`) and datasource (`.tds`) files. If you've been making changes to these file types by directly updating the XML--that is, by XML hacking--this SDK is for you :)
-
-Currently only the following operations are supported:
-
-- Modify database server
-- Modify database name
-- Modify database user
-
-We don't yet support creating files from scratch. In addition, support for `.twbx` and `.tdsx` files is coming.
+The Document API provides a supported way to programmatically make updates to Tableau workbook and data source files. If you've been making changes to these file types by directly updating the XML--that is, by XML hacking--this SDK is for you :)
+
+Features include:
+- Support for 9.X and 10.X workbook and data source files
+  - Including TDSX and TWBX files
+- Getting connection information from data sources and workbooks
+  - Server Name
+  - Username
+  - Database Name
+  - Authentication Type
+  - Connection Type
+- Updating connection information in workbooks and data sources
+  - Server Name
+  - Username
+  - Database Name
+- Getting Field information from data sources and workbooks
+  - Get all fields in a data source
+  - Get all fields in use by certain sheets in a workbook
+
+We don't yet support creating files from scratch, adding extracts into workbooks or data sources, or updating field information.


 ###Getting Started
@@ -34,8 +45,19 @@ Download the `.zip` file that contains the SDK. Unzip the file and then run the
 pip install -e <directory containing setup.py>
 ```

-We plan on putting the package in PyPi to make installation easier.
+#### Installing the Development Version From Git
+
+*Only do this if you know you want the development version; there is no guarantee that we won't break APIs during development.*
+
+```text
+pip install git+https://github.com/tableau/document-api-python.git@development
+```
+
+If you go this route, but want to switch back to the non-development version, you need to run the following command before installing the stable version:

+```text
+pip uninstall tableaudocumentapi
+```

 ###Basics
 The following example shows the basic syntax for using the Document API to update a workbook:
@@ -52,7 +74,7 @@ sourceWB.datasources[0].connections[0].username = "benl"
 sourceWB.save()
 ```

-With Data Integration in Tableau 10, a datasource can have multiple connections. To access the connections simply index them like you would datasources
+With Data Integration in Tableau 10, a data source can have multiple connections. To access the connections simply index them like you would datasources

 ```python
 from tableaudocumentapi import Workbook
@@ -75,13 +97,13 @@ sourceWB.save()
 **Notes**

 - Import the `Workbook` object from the `tableaudocumentapi` module.
-- To open a workbook, instantiate a `Workbook` object and pass the `.twb` file name in the constructor.
-- The `Workbook` object exposes a `datasources` collection.
-- Each datasource object has a `connection` object that supports a `server`, `dbname`, and `username` property.
+- To open a workbook, instantiate a `Workbook` object and pass the file name as the first argument.
+- The `Workbook` object exposes a list of `datasources` in the workbook
+- Each data source object has a `connection` object that supports a `server`, `dbname`, and `username` property.
 - Save changes to the workbook by calling the `save` or `save_as` method.



 ###Examples

-The downloadable package contains an example named `replicateWorkbook.py` (in the folder `\Examples\Replicate Workbook`). This example reads an existing workbook and reads a .csv file that contains a list of servers, database names, and users. For each new user in the .csv file, the code copies the original workbook, updates the `server`, `dbname`, and `username` properties, and saves the workbook under a new name.
+The downloadable package contains several example scripts that show more detailed usage of the Document API
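
The README's "fields in use by certain sheets" feature can be illustrated without the library installed. The following is a minimal sketch, assuming only that each field carries a list of worksheet names; `SketchField` and this free-standing `used_by_sheet` function are hypothetical stand-ins for the library's `Field` objects and the `FieldDictionary.used_by_sheet` method added in this commit.

```python
from collections import namedtuple

# Hypothetical stand-in for the library's Field object: a name plus the
# worksheets that reference the field.
SketchField = namedtuple('SketchField', ['name', 'worksheets'])


def used_by_sheet(fields, sheets):
    """Return the fields referenced by one sheet name or by any of several."""
    if isinstance(sheets, str):
        sheets = [sheets]  # accept a single name as well as a list of names
    wanted = set(sheets)
    return [f for f in fields if wanted.intersection(f.worksheets)]


fields = [
    SketchField('Country', ['Map', 'Summary']),
    SketchField('Population', ['Summary']),
    SketchField('Unused', []),
]

print([f.name for f in used_by_sheet(fields, 'Map')])
print([f.name for f in used_by_sheet(fields, ['Map', 'Summary'])])
```

Accepting either a string or a list mirrors the two branches of `used_by_sheet` in `datasource.py`; normalizing to a list first is a simplification of this sketch, not the committed code.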

contributing.md

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+# Contributing
+
+We welcome contributions to this project!
+
+Contributions can include, but are not limited to, any of the following:
+
+* File an Issue
+* Request a Feature
+* Implement a Requested Feature
+* Fix an Issue/Bug
+* Add/Fix documentation
+
+Contributions must follow the guidelines outlined on the [Tableau Organization](http://tableau.github.io/) page, though filing an issue or requesting
+a feature does not require the CLA.
+
+## Issues and Feature Requests
+
+To submit an issue/bug report, or to request a feature, please submit a [GitHub issue](https://github.com/tableau/document-api-python/issues) to the repo.
+
+If you are submitting a bug report, please provide as much information as you can, including clear and concise repro steps, attaching any necessary
+files to assist in the repro. **Be sure to scrub the files of any potentially sensitive information. Issues are public.**
+
+For a feature request, please try to describe the scenario you are trying to accomplish that requires the feature. This will help us understand
+the limitations that you are running into, and provide us with a use case to know if we've satisfied your request.
+
+## Fixes, Implementations, and Documentation
+
+For all other things, please submit a PR that includes the fix, documentation, or new code that you are trying to contribute. More information on
+creating a PR can be found in the [GitHub documentation](https://help.github.com/articles/creating-a-pull-request/).
+
+If the feature is complex or has multiple solutions that could be equally appropriate approaches, it would be helpful to file an issue to discuss the
+design trade-offs of each solution before implementing, to allow us to collectively arrive at the best solution, which most likely exists in the middle
+somewhere.

publish.sh

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+#!/usr/bin/env bash
+
+set -e
+
+rm -rf dist
+python setup.py sdist
+python setup.py bdist_wheel
+python3 setup.py bdist_wheel
+twine upload dist/*

setup.cfg

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+[wheel]
+universal = 1
+
+[pycodestyle]
+select =
+max_line_length = 120
+
+[pep8]
+max_line_length = 120
+

setup.py

Lines changed: 3 additions & 2 deletions
@@ -5,11 +5,12 @@

 setup(
     name='tableaudocumentapi',
-    version='0.0.1',
+    version='0.2',
     author='Tableau Software',
     author_email='github@tableau.com',
     url='https://github.com/tableau/document-api-python',
     packages=['tableaudocumentapi'],
     license='MIT',
-    description='A Python module for working with Tableau files.'
+    description='A Python module for working with Tableau files.',
+    test_suite='test'
 )

tableaudocumentapi/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -1,5 +1,7 @@
+from .field import Field
 from .connection import Connection
 from .datasource import Datasource, ConnectionParser
 from .workbook import Workbook
+
 __version__ = '0.0.1'
 __VERSION__ = __version__

tableaudocumentapi/datasource.py

Lines changed: 89 additions & 6 deletions
@@ -3,12 +3,67 @@
 # Datasource - A class for writing datasources to Tableau files
 #
 ###############################################################################
+import collections
+import itertools
 import xml.etree.ElementTree as ET
-from tableaudocumentapi import Connection
+import xml.sax.saxutils as sax

+from tableaudocumentapi import Connection, xfile
+from tableaudocumentapi import Field
+from tableaudocumentapi.multilookup_dict import MultiLookupDict
+from tableaudocumentapi.xfile import xml_open

-class ConnectionParser(object):
+########
+# This is needed in order to determine if something is a string or not. It is necessary because
+# of differences between python2 (basestring) and python3 (str). If python2 support is ever
+# dropped, remove this and change the basestring references below to str
+try:
+    basestring
+except NameError:
+    basestring = str
+########
+
+_ColumnObjectReturnTuple = collections.namedtuple('_ColumnObjectReturnTupleType', ['id', 'object'])
+
+
+def _get_metadata_xml_for_field(root_xml, field_name):
+    if "'" in field_name:
+        field_name = sax.escape(field_name, {"'": "&apos;"})
+    xpath = ".//metadata-record[@class='column'][local-name='{}']".format(field_name)
+    return root_xml.find(xpath)
+
+
+def _is_used_by_worksheet(names, field):
+    return any((y for y in names if y in field.worksheets))
+
+
+class FieldDictionary(MultiLookupDict):
+    def used_by_sheet(self, name):
+        # If we pass in a string, no need to get complicated, just check to see if name is in
+        # the field's list of worksheets
+        if isinstance(name, basestring):
+            return [x for x in self.values() if name in x.worksheets]
+
+        # if we pass in a list, we need to check to see if any of the names in the list are in
+        # the field's list of worksheets
+        return [x for x in self.values() if _is_used_by_worksheet(name, x)]
+
+
+def _column_object_from_column_xml(root_xml, column_xml):
+    field_object = Field.from_column_xml(column_xml)
+    local_name = field_object.id
+    metadata_record = _get_metadata_xml_for_field(root_xml, local_name)
+    if metadata_record is not None:
+        field_object.apply_metadata(metadata_record)
+    return _ColumnObjectReturnTuple(field_object.id, field_object)

+
+def _column_object_from_metadata_xml(metadata_xml):
+    field_object = Field.from_metadata_xml(metadata_xml)
+    return _ColumnObjectReturnTuple(field_object.id, field_object)
+
+
+class ConnectionParser(object):
     def __init__(self, datasource_xml, version):
         self._dsxml = datasource_xml
         self._dsversion = version
@@ -52,11 +107,13 @@ def __init__(self, dsxml, filename=None):
         self._connection_parser = ConnectionParser(
             self._datasourceXML, version=self._version)
         self._connections = self._connection_parser.get_connections()
+        self._fields = None

     @classmethod
     def from_file(cls, filename):
-        "Initialize datasource from file (.tds)"
-        dsxml = ET.parse(filename).getroot()
+        """Initialize datasource from file (.tds)"""
+
+        dsxml = xml_open(filename).getroot()
         return cls(dsxml, filename)

     def save(self):
def save(self):
@@ -72,7 +129,8 @@ def save(self):
         """

         # save the file
-        self._datasourceTree.write(self._filename, encoding="utf-8", xml_declaration=True)
+
+        xfile._save_file(self._filename, self._datasourceTree)

     def save_as(self, new_filename):
         """
@@ -85,7 +143,7 @@ def save_as(self, new_filename):
             Nothing.

         """
-        self._datasourceTree.write(new_filename, encoding="utf-8", xml_declaration=True)
+        xfile._save_file(self._filename, self._datasourceTree, new_filename)

     ###########
     # name
@@ -107,3 +165,28 @@ def version(self):
     @property
     def connections(self):
         return self._connections
+
+    ###########
+    # fields
+    ###########
+    @property
+    def fields(self):
+        if not self._fields:
+            self._fields = self._get_all_fields()
+        return self._fields
+
+    def _get_all_fields(self):
+        column_field_objects = self._get_column_objects()
+        existing_column_fields = [x.id for x in column_field_objects]
+        metadata_only_field_objects = (x for x in self._get_metadata_objects() if x.id not in existing_column_fields)
+        field_objects = itertools.chain(column_field_objects, metadata_only_field_objects)
+
+        return FieldDictionary({k: v for k, v in field_objects})
+
+    def _get_metadata_objects(self):
+        return (_column_object_from_metadata_xml(x)
+                for x in self._datasourceTree.findall(".//metadata-record[@class='column']"))
+
+    def _get_column_objects(self):
+        return [_column_object_from_column_xml(self._datasourceTree, xml)
+                for xml in self._datasourceTree.findall('.//column')]
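
The lookup and merge logic in this file can be tried in isolation with the standard library. The following is a minimal sketch, not the library's code: the sample XML is a hypothetical, heavily trimmed datasource, the helper names are invented for illustration, and it assumes a column's `name` attribute and a metadata record's `local-name` child share the same bracketed identifier.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down datasource XML; real .tds files carry far more detail.
SAMPLE = """
<datasource>
  <connection>
    <metadata-records>
      <metadata-record class='column'>
        <local-name>[Country]</local-name>
        <local-type>string</local-type>
      </metadata-record>
      <metadata-record class='column'>
        <local-name>[Population]</local-name>
        <local-type>integer</local-type>
      </metadata-record>
    </metadata-records>
  </connection>
  <column name='[Country]' datatype='string' role='dimension' />
</datasource>
"""


def find_metadata_record(root, field_id):
    # Same XPath shape as _get_metadata_xml_for_field: match on the class
    # attribute and on the text of the <local-name> child element.
    xpath = ".//metadata-record[@class='column'][local-name='{}']".format(field_id)
    return root.find(xpath)


def collect_field_ids(root):
    # Columns come first; metadata records only contribute ids not already seen,
    # mirroring the order used by _get_all_fields.
    column_ids = [c.get('name') for c in root.findall('.//column')]
    metadata_ids = [r.findtext('local-name')
                    for r in root.findall(".//metadata-record[@class='column']")]
    return column_ids + [m for m in metadata_ids if m not in column_ids]


root = ET.fromstring(SAMPLE)
print(collect_field_ids(root))
print(find_metadata_record(root, '[Country]').findtext('local-type'))
```

Here `[Country]` appears once even though both a column and a metadata record describe it, while `[Population]` is picked up from metadata alone.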
