enigma1

Ethereum Copyright Storage and Web Crawler

This project combines Ethereum-based copyright storage with a web crawler that uses Puppeteer, regular expressions (regex), depth-first search (DFS), and optical character recognition (OCR) to identify and flag websites that may infringe copyrights. This README provides an overview of the project, installation instructions, and usage guidelines.

Project Overview

The goal of this project is to create a decentralized copyright storage system on the Ethereum blockchain, so that users can store copyright information securely and immutably. The project also includes a web crawler, built on Puppeteer, that scans websites for potential copyright infringement using regex pattern matching, DFS, and OCR.

The main components of the project are as follows:

Ethereum Contract: Contains the smart contract code that enables copyright storage on the Ethereum blockchain.
Web Crawler: Utilizes Puppeteer, a Node.js library, to crawl websites and analyze their content for potential copyright infringement. It uses regex pattern matching to identify copyright-related keywords and OCR to analyze images containing copyrighted content.
Flagging Mechanism: When the web crawler identifies potential copyright infringement, it flags the respective websites and generates a report for further analysis.
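
The depth-first traversal used by the crawler can be illustrated with a minimal sketch. It assumes a hypothetical `getLinks(url)` callback that returns the outgoing links of a page; in the real crawler this lookup would presumably be an asynchronous Puppeteer call. The function name, the in-memory graph, and the depth limit are illustrative assumptions, not taken from the project code.

```javascript
// Depth-limited DFS over a link graph (illustrative sketch, not the project's code).
// `getLinks` is a hypothetical callback: given a URL, it returns an array of linked URLs.
function crawlDepthFirst(startUrl, getLinks, maxDepth) {
  const visited = new Set(); // URLs already seen, so cycles do not loop forever
  const order = [];          // URLs in the order they were visited

  function visit(url, depth) {
    if (depth > maxDepth || visited.has(url)) return;
    visited.add(url);
    order.push(url);
    // Follow each link fully before moving on to the next sibling (depth-first).
    for (const next of getLinks(url)) {
      visit(next, depth + 1);
    }
  }

  visit(startUrl, 0);
  return order;
}

// Example with an in-memory graph standing in for real pages.
const demoGraph = {
  "https://a.example": ["https://b.example", "https://c.example"],
  "https://b.example": ["https://d.example"],
  "https://c.example": ["https://a.example"],
  "https://d.example": [],
};
const visitOrder = crawlDepthFirst("https://a.example", (u) => demoGraph[u] || [], 2);
console.log(visitOrder);
```

The `visited` set keeps the crawler from revisiting pages that link back to each other, and `maxDepth` bounds how far it wanders from the starting URL.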

Installation

To install and set up the project, follow these steps:

Clone the project repository from GitHub: git clone <repository-url>
Navigate to the project directory: cd enigma1
Install the required dependencies: npm install
Configure the Ethereum network:

Connect to an Ethereum network or set up a local development network using tools like Ganache or Truffle.
Update the Ethereum network configuration in the project files as necessary.

Set up the web crawler environment:

Install Puppeteer:
```
npm install puppeteer
```
Install any additional dependencies required for OCR, such as Tesseract.js.

Configure the web crawler:

Update the crawler settings in the project files, such as the starting URLs, crawling depth, and OCR configuration.
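
As a rough illustration, the settings mentioned above could be collected in a single object and checked before the crawler starts. Every key name below (`startUrls`, `maxDepth`, `ocr.language`) is an assumption made for this sketch; the project's actual configuration layout may differ.

```javascript
// Hypothetical crawler settings; key names are assumptions for illustration only.
const crawlerConfig = {
  startUrls: ["https://example.com"], // pages where crawling begins
  maxDepth: 2,                        // how many link hops to follow from a start URL
  ocr: { language: "eng" },           // OCR language code (Tesseract uses codes such as "eng")
};

// Basic sanity check so that a misconfiguration fails early.
// Returns a list of human-readable problems; an empty list means the config looks usable.
function validateCrawlerConfig(config) {
  const errors = [];
  if (!Array.isArray(config.startUrls) || config.startUrls.length === 0) {
    errors.push("startUrls must be a non-empty array");
  }
  if (!Number.isInteger(config.maxDepth) || config.maxDepth < 0) {
    errors.push("maxDepth must be a non-negative integer");
  }
  return errors;
}

console.log(validateCrawlerConfig(crawlerConfig)); // prints []
```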

Usage

To use the project, follow these steps:

Start the Ethereum network or connect to the desired network where the copyright storage contract is deployed.
Deploy the copyright storage smart contract to the Ethereum network:

Compile and deploy the contract using Truffle or your preferred Ethereum development framework.
Configure the contract address and ABI in the project files.
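
Before wiring the deployed contract into the rest of the project, it can help to sanity-check the configured values. The sketch below assumes a hypothetical `contractConfig` object; the address shown is a placeholder, not a deployed contract, and the check covers only the format of the address (it does not verify the EIP-55 checksum or that a contract exists at that address).

```javascript
// Hypothetical contract settings; replace both values after deployment.
const contractConfig = {
  address: "0x0000000000000000000000000000000000000000", // placeholder address
  abi: [], // paste the ABI array produced by the compiler here
};

// An Ethereum address is "0x" followed by 40 hexadecimal characters (20 bytes).
function isValidAddress(address) {
  return typeof address === "string" && /^0x[0-9a-fA-F]{40}$/.test(address);
}

// Throws a descriptive error if the configuration is obviously malformed.
function checkContractConfig(config) {
  if (!isValidAddress(config.address)) {
    throw new Error("Contract address is not a 20-byte hex string");
  }
  if (!Array.isArray(config.abi)) {
    throw new Error("ABI must be an array of interface entries");
  }
  return true;
}

console.log(checkContractConfig(contractConfig));
```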

Launch the web crawler: node crawler.js
The web crawler will start scanning websites based on the configured settings.

The crawler will analyze the web pages using regex patterns to identify copyright-related keywords.
It will also perform OCR on images to check for potential copyrighted content.
When infringement is detected, the crawler will flag the website and generate a report.
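
The scan-and-flag flow described above might look roughly like the following sketch. The regex patterns are illustrative only, since this README does not list the project's actual patterns. The `page` shape is also an assumption: `text` is the page's visible text, and `imageTexts` holds strings that an OCR step (for example Tesseract.js) is assumed to have already extracted from images. A keyword match marks a page for review; it does not by itself establish infringement.

```javascript
// Illustrative patterns only; not the project's actual list.
// No "g" flag is used, so RegExp.prototype.test stays stateless between calls.
const COPYRIGHT_PATTERNS = [/©/, /\(c\)\s*\d{4}/i, /copyright/i, /all rights reserved/i];

// Checks the page text and every OCR-extracted string against each pattern.
function scanPage(page) {
  const sources = [
    { kind: "text", content: page.text },
    ...page.imageTexts.map((content) => ({ kind: "ocr", content })),
  ];
  const matches = [];
  for (const { kind, content } of sources) {
    for (const pattern of COPYRIGHT_PATTERNS) {
      if (pattern.test(content)) matches.push({ kind, pattern: pattern.source });
    }
  }
  return { url: page.url, flagged: matches.length > 0, matches };
}

// The report keeps only the pages that were flagged.
function buildReport(pages) {
  return pages.map(scanPage).filter((result) => result.flagged);
}

const report = buildReport([
  { url: "https://a.example", text: "Welcome to my blog", imageTexts: [] },
  { url: "https://b.example", text: "Free download", imageTexts: ["Copyright 2021 Studio X"] },
]);
console.log(JSON.stringify(report, null, 2));
```

In this example only the second page ends up in the report, because the keyword appears in text recovered from an image rather than in the page's own text.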

Monitor the crawler output and review the generated reports for potential copyright infringement.

Winning solution of Smart India Hackathon (SIH) 2022

Problem Statement Code: NS1228

Team:

Aryaman Raj

Astitwa Dwivedi

Ishita Srivastava

Anmol Bansal

Ashwinee Kumar Samdarshi

Sarthak Saxena
