Commit 885430d

Merge branch 'release-3.4.0'
2 parents 1afaacc + f2b58d4 commit 885430d

104 files changed, +54222 -4883 lines changed

CONTRIBUTING.md

Lines changed: 3 additions & 1 deletion
@@ -12,7 +12,9 @@ Also, please check the [Gensim FAQ](https://github.com/RaRe-Technologies/gensim/
 2. Clone your fork: `git clone https://github.com/<YOUR_GITHUB_USERNAME>/gensim.git`
 3. Create a new branch based on `develop`: `git checkout -b my-feature develop`
 4. Setup your Python enviroment
-- Create a new [virtual environment](https://virtualenv.pypa.io/en/stable/): `pip install virtualenv; virtualenv gensim_env; source gensim_env/bin/activate`
+- Create a new [virtual environment](https://virtualenv.pypa.io/en/stable/): `pip install virtualenv; virtualenv gensim_env` and activate it:
+- For linux: `source gensim_env/bin/activate`
+- For windows: `gensim_env\Scripts\activate`
 - Install Gensim and its test dependencies in [editable mode](https://pip.pypa.io/en/stable/reference/pip_install/#editable-installs):
 - For linux: `pip install -e .[test]`
 - For windows: `pip install -e .[test-win]`

MANIFEST.in

Lines changed: 6 additions & 0 deletions
@@ -12,3 +12,9 @@ include gensim/models/doc2vec_inner.c
 include gensim/models/doc2vec_inner.pyx
 include gensim/models/fasttext_inner.c
 include gensim/models/fasttext_inner.pyx
+include gensim/models/_utils_any2vec.c
+include gensim/models/_utils_any2vec.pyx
+include gensim/corpora/_mmreader.c
+include gensim/corpora/_mmreader.pyx
+include gensim/_matutils.c
+include gensim/_matutils.pyx

README.md

Lines changed: 20 additions & 19 deletions
@@ -3,6 +3,7 @@ gensim – Topic Modelling in Python
 
 [![Build Status](https://travis-ci.org/RaRe-Technologies/gensim.svg?branch=develop)](https://travis-ci.org/RaRe-Technologies/gensim)
 [![GitHub release](https://img.shields.io/github/release/rare-technologies/gensim.svg?maxAge=3600)](https://github.com/RaRe-Technologies/gensim/releases)
+[![Conda-forge Build](https://anaconda.org/conda-forge/gensim/badges/version.svg)](https://anaconda.org/conda-forge/gensim)
 [![Wheel](https://img.shields.io/pypi/wheel/gensim.svg)](https://pypi.python.org/pypi/gensim)
 [![DOI](https://zenodo.org/badge/DOI/10.13140/2.1.2393.1847.svg)](https://doi.org/10.13140/2.1.2393.1847)
 [![Mailing List](https://img.shields.io/badge/-Mailing%20List-brightgreen.svg)](https://groups.google.com/forum/#!forum/gensim)
@@ -122,25 +123,25 @@ Adopters
 
 | Name | Logo | URL | Description |
 |----------------------------------------|--------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| RaRe Technologies | <img src="http://rare-technologies.com/wp-content/uploads/2016/02/rare_image_only.png" width="100"> | [rare-technologies.com](http://rare-technologies.com) | Machine learning & NLP consulting and training. Creators and maintainers of Gensim. |
-| Mindseye | <img src="http://www.mindseyesolutions.com/wp-content/uploads/2015/12/Mindseye_logo_website.jpg" width="100"> | [mindseye.com](http://www.mindseyesolutions.com/) | Similarities in legal documents |
-| Talentpair | ![Talentpair](https://avatars3.githubusercontent.com/u/8418395?v=3&s=100) | [talentpair.com](http://talentpair.com) | Data science driving high-touch recruiting |
-| Tailwind | <img src="http://blog.tailwindapp.com/wp-content/uploads/2013/10/Tailwind-Square-Logo-Blue-White-300x300.png" width="100"> | [Tailwindapp.com](https://www.tailwindapp.com/)| Post interesting and relevant content to Pinterest |
-| Issuu | <img src="https://static.isu.pub/fe/issuu-brandpages/s3/155/press/assets/brand_package_zip/issuu%20logos/png/issuu-logo-stacked-colour.png" width="100"> | [Issuu.com](https://issuu.com/)| Gensim’s LDA module lies at the very core of the analysis we perform on each uploaded publication to figure out what it’s all about.
-| Sports Authority | <img src="https://upload.wikimedia.org/wikipedia/commons/6/6c/Sports_Authority_logo2011.jpg" width="100"> | [sportsauthority.com](https://en.wikipedia.org/wiki/Sports_Authority)| Text mining of customer surveys and social media sources |
-| Search Metrics | <img src="http://www.searchmetrics.com/wp-content/uploads/Logo_searchmetrics_Webversion.png" width="100"> | [searchmetrics.com](http://www.searchmetrics.com/)| Gensim word2vec used for entity disambiguation in Search Engine Optimisation
-| Cisco Security | <img src="https://supportforums.cisco.com/sites/default/files/legacy/1/6/1/2161-CiscoSystems.gif" width="100"> | [cisco.com](http://www.cisco.com/c/en/us/products/security/index.html)| Large-scale fraud detection
-| 12K Research | <img src="https://static1.squarespace.com/static/548d6f40e4b0fb61d7b8f40b/t/57310800b09f95e472ba5dd1/1462831123953/12k-logo.png" width="100"> | [12k.co](https://12k.co/)| Document similarity analysis on media articles
-| National Institutes of Health | <img src="https://www.nih.gov/sites/default/files/styles/featured_media_breakpoint-large/public/about-nih/2012-logo.png" width="100"> | [github/NIHOPA](https://github.com/NIHOPA/pipeline_word2vec)| Processing grants and publications with word2vec
-| Codeq LLC | <img src="https://codeq.com/wp-content/themes/codeq/assets/img/logo.svg" width="100"> | [codeq.com](https://codeq.com)| Document classification with word2vec
-| Mass Cognition | <img src="http://static1.squarespace.com/static/5637b16ee4b050255657c537/t/56a683bf9cadb6bf86a0ea13/1461016648294/?format=1500w" width="100"> | [masscognition.com](http://www.masscognition.com/) | Topic analysis service for consumer text data and general text data |
-| Stillwater Supercomputing | <img src="http://www.stillwater-sc.com/img/stillwater-logo.png" width="100"> | [stillwater-sc.com](http://www.stillwater-sc.com/) | Document comprehension and association with word2vec |
-| Channel 4 | <img src="http://www.channel4.com/static/info/images/lib/c4logo_2015_info_corporate.jpg" width="100"> | [channel4.com](http://www.channel4.com/) | Recommendation engine |
-| Amazon | <img src="http://g-ec2.images-amazon.com/images/G/01/social/api-share/amazon_logo_500500._V323939215_.png" width="100"> | [amazon.com](http://www.amazon.com/) | Document similarity|
-| SiteGround Hosting | <img src="https://www.siteground.com/img/knox/logos/siteground.png" width="100"> | [siteground.com](https://www.siteground.com/) | An ensemble search engine which uses different embeddings models and similarities, including word2vec, WMD, and LDA. |
-| Juju | <img src="https://d5k1a84rm5hwo.cloudfront.net/img/juju_home_logo.png" width="100"> | [www.juju.com](http://www.juju.com/) | Provide non-obvious related job suggestions. |
-| NLPub | <img src="https://nlpub.org/images/thumb/a/aa/NLPub.svg/240px-NLPub.svg.png" width="100"> | [nlpub.org](https://nlpub.org/) | Distributional semantic models including word2vec. |
-|Capital One | <img src="https://s3.amazonaws.com/fjds/member/original/1245173/C1_Core_NG_RGB_R_%281%29.PNG?1456169388" width="200"> | [www.capitalone.com](https://www.capitalone.com/) | Topic modeling for customer complaints exploration. |
+| RaRe Technologies | ![rare](docs/src/readme_images/rare.png) | [rare-technologies.com](http://rare-technologies.com) | Machine learning & NLP consulting and training. Creators and maintainers of Gensim. |
+| Mindseye | ![mindseye](docs/src/readme_images/mindseye.png) | [mindseye.com](http://www.mindseyesolutions.com/) | Similarities in legal documents |
+| Talentpair | ![talent-pair](docs/src/readme_images/talent-pair.png) | [talentpair.com](http://talentpair.com) | Data science driving high-touch recruiting |
+| Tailwind | ![tailwind](docs/src/readme_images/tailwind.png)| [Tailwindapp.com](https://www.tailwindapp.com/)| Post interesting and relevant content to Pinterest |
+| Issuu | ![issuu](docs/src/readme_images/issuu.png) | [Issuu.com](https://issuu.com/)| Gensim’s LDA module lies at the very core of the analysis we perform on each uploaded publication to figure out what it’s all about.
+| Sports Authority | ![sports-authority](docs/src/readme_images/sports-authority.png) | [sportsauthority.com](https://en.wikipedia.org/wiki/Sports_Authority)| Text mining of customer surveys and social media sources |
+| Search Metrics | ![search-metrics](docs/src/readme_images/search-metrics.png) | [searchmetrics.com](http://www.searchmetrics.com/)| Gensim word2vec used for entity disambiguation in Search Engine Optimisation
+| Cisco Security | ![cisco](docs/src/readme_images/cisco.png) | [cisco.com](http://www.cisco.com/c/en/us/products/security/index.html)| Large-scale fraud detection
+| 12K Research | ![12k](docs/src/readme_images/12k.png)| [12k.co](https://12k.co/)| Document similarity analysis on media articles
+| National Institutes of Health | ![nih](docs/src/readme_images/nih.png) | [github/NIHOPA](https://github.com/NIHOPA/pipeline_word2vec)| Processing grants and publications with word2vec
+| Codeq LLC | ![codeq](docs/src/readme_images/codeq.png) | [codeq.com](https://codeq.com)| Document classification with word2vec
+| Mass Cognition | ![mass-cognition](docs/src/readme_images/mass-cognition.png) | [masscognition.com](http://www.masscognition.com/) | Topic analysis service for consumer text data and general text data |
+| Stillwater Supercomputing | ![stillwater](docs/src/readme_images/stillwater.png) | [stillwater-sc.com](http://www.stillwater-sc.com/) | Document comprehension and association with word2vec |
+| Channel 4 | ![channel4](docs/src/readme_images/channel4.png) | [channel4.com](http://www.channel4.com/) | Recommendation engine |
+| Amazon | ![amazon](docs/src/readme_images/amazon.png) | [amazon.com](http://www.amazon.com/) | Document similarity|
+| SiteGround Hosting | ![siteground](docs/src/readme_images/siteground.png) | [siteground.com](https://www.siteground.com/) | An ensemble search engine which uses different embeddings models and similarities, including word2vec, WMD, and LDA. |
+| Juju | ![juju](docs/src/readme_images/juju.png) | [www.juju.com](http://www.juju.com/) | Provide non-obvious related job suggestions. |
+| NLPub | ![nlpub](docs/src/readme_images/nlpub.png) | [nlpub.org](https://nlpub.org/) | Distributional semantic models including word2vec. |
+|Capital One | ![capitalone](docs/src/readme_images/capitalone.png) | [www.capitalone.com](https://www.capitalone.com/) | Topic modeling for customer complaints exploration. |
 
 -------
 