A-Text-Based-Network

This repo is a Python port of the R project A Text-Based Network by Majeed Simaan. The original R project is published on RPubs: https://rpubs.com/simaan84/410145

Summary

Basics

This program is recommended to run with

  • Python version 3.X.
  • BeautifulSoup4
  • nltk
  • pyvis
  • numpy
  • textdistance
  • pandas

Install by downloading the files from the GitHub page, or by cloning the repository with git:

git clone https://github.com/songyesog2000/A-Text-Based-Network.git

Company Profiles

The profiles of listed companies are collected from Yahoo Finance via web mining. The Python program for this is run as

python profiles_list.py --tickers [string of ticker symbols separated by ',']

If --tickers is not provided, the program defaults to collecting the profiles of 'JPM', 'BAC', 'GOOG', 'AAPL', 'MMM', 'AAC', 'T', 'VZ', 'XOM', 'CVX', 'KO', 'BUD'. The collected profiles are stored in the JSON file profiles_list.json.
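The command-line behaviour described above can be sketched as follows. This is a minimal illustration, not the code in profiles_list.py: the names `parse_tickers`, `save_profiles`, and `DEFAULT_TICKERS` are hypothetical, and the layout of the JSON file (a mapping from ticker to profile text) is an assumption.

```python
import argparse
import json

# Default tickers as listed in this README.
DEFAULT_TICKERS = ["JPM", "BAC", "GOOG", "AAPL", "MMM", "AAC",
                   "T", "VZ", "XOM", "CVX", "KO", "BUD"]


def parse_tickers(argv=None):
    """Return the tickers given via --tickers, or the defaults if omitted."""
    parser = argparse.ArgumentParser(description="Collect company profiles.")
    parser.add_argument("--tickers", default=None,
                        help="ticker symbols separated by ','")
    args = parser.parse_args(argv)
    if not args.tickers:
        return list(DEFAULT_TICKERS)
    # Split on commas; drop surrounding whitespace and empty entries.
    return [t.strip() for t in args.tickers.split(",") if t.strip()]


def save_profiles(profiles, path="profiles_list.json"):
    """Write a {ticker: profile text} mapping to a JSON file (assumed layout)."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(profiles, fh, indent=2)
```

For example, `parse_tickers(["--tickers", "JPM,BAC"])` yields `["JPM", "BAC"]`, while `parse_tickers([])` falls back to the default list.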

Text-based Network

The distance between companies is defined as the Jaro-Winkler distance between their profiles. The distances are then converted to truncated similarity values (truncated at 0.25 in the example).

[Image: example_W_similarity]

Based on the similarities, a network graph is built and stored in G.html; opening this HTML file in a browser shows the visualization.

[Image: network graph]

The whole process above is run with a single command:

python Text-based\ Network.py
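The step from profile text to network edges can be sketched in plain Python as below. This is an illustration under stated assumptions, not the repository's code: the repo lists textdistance and pyvis as dependencies, whereas here a hand-written `jaro_winkler` stands in for the library call, similarity is taken directly as the Jaro-Winkler similarity, and "truncated" is assumed to mean that values below the threshold are set to zero. The names `truncated_similarity` and `edge_list` are hypothetical.

```python
def jaro(s1, s2):
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1, matched2 = [False] * len1, [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        # Look for an unmatched equal character within the match window.
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1, s2, p=0.1):
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    sim = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return sim + prefix * p * (1 - sim)


def truncated_similarity(profiles, threshold=0.25):
    """Pairwise similarity matrix; entries below threshold are zeroed (assumed)."""
    tickers = list(profiles)
    n = len(tickers)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sim = jaro_winkler(profiles[tickers[i]], profiles[tickers[j]])
            if sim >= threshold:
                W[i][j] = W[j][i] = sim
    return tickers, W


def edge_list(tickers, W):
    """Weighted edges (ticker_a, ticker_b, weight) for every nonzero entry."""
    n = len(tickers)
    return [(tickers[i], tickers[j], W[i][j])
            for i in range(n) for j in range(i + 1, n) if W[i][j] > 0]
```

The resulting edge list is the kind of input a graph library such as pyvis could take to draw the network; the diagonal is left at zero here so that no company is linked to itself.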
