A Go-based search engine for Indonesian property news articles. It ranks results using TF-IDF weighting with cosine and Jaccard similarity measures.
- Group: 15
- Member: Muhammad Mahathir (2208107010056)
The search engine crawls and indexes articles from three major Indonesian property news portals:
- Property Value Trends (Tren Nilai Properti)
  - Analysis of property price movements
  - Factors that influence property values
  - Property value predictions and forecasts
- Housing Market News (Berita Bursa Perumahan)
  - Property market developments
  - Property policies and regulations
  - Housing market trends and dynamics
- Property Listings (Listing Properti)
  - Information on properties for sale or rent
  - Property details and specifications
  - Property price comparisons
- Advanced Search Algorithms
  - TF-IDF (Term Frequency-Inverse Document Frequency)
  - Cosine Similarity
  - Jaccard Similarity
- Text Processing
  - Stemming using Sastrawi (Indonesian language)
  - Stopword removal
  - Case folding
  - Punctuation and number removal
- Search Results
  - Relevance-based ranking
  - Content preview with query highlighting
  - Pagination support
  - Source website favicon display
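
The text processing steps listed above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual code: the function name `preprocess` and the tiny stopword list are assumptions made for the example, and stemming is omitted because the project delegates it to Go-Sastrawi.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// stopwords is a tiny illustrative sample, not the project's actual list.
var stopwords = map[string]bool{
	"yang": true, "dan": true, "di": true, "ke": true, "dari": true, "untuk": true,
}

// preprocess (hypothetical name) applies case folding, strips punctuation
// and digits, tokenizes on whitespace, and drops stopwords.
// Stemming is intentionally left out of this sketch.
func preprocess(text string) []string {
	// Case folding.
	text = strings.ToLower(text)

	// Replace every rune that is not a letter with a space,
	// which removes punctuation and numbers in one pass.
	cleaned := strings.Map(func(r rune) rune {
		if unicode.IsLetter(r) {
			return r
		}
		return ' '
	}, text)

	// Tokenize on whitespace and filter out stopwords.
	tokens := []string{}
	for _, tok := range strings.Fields(cleaned) {
		if !stopwords[tok] {
			tokens = append(tokens, tok)
		}
	}
	return tokens
}

func main() {
	fmt.Println(preprocess("Harga rumah di Jakarta naik 5% pada 2024!"))
	// prints: [harga rumah jakarta naik pada]
}
```

In a fuller pipeline, each surviving token would then be passed through the stemmer before indexing.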
- Backend: Go (Golang)
- Web Framework: Gin
- Template Engine: Go HTML Templates
- Text Processing: Go-Sastrawi
- Data Storage: JSON
.
├── articles.json      # Indexed articles data
├── main.go            # Main application entry point
├── search.go          # Search engine implementation
├── static/            # Static assets (images, favicon)
└── templates/         # HTML templates
    ├── 404.html
    ├── document.html
    ├── index.html
    └── results.html
- Search Engine
  - Inverted index construction
  - TF-IDF score calculation
  - Vector space model implementation
  - Multiple similarity measures
- Text Processing Pipeline
  - Text cleaning and normalization
  - Tokenization
  - Stopword removal
  - Stemming for Indonesian language
- Web Interface
  - Clean and responsive design
  - Real-time search results
  - Article preview with highlighted matches
  - Pagination for large result sets
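
To make the inverted index and TF-IDF components more concrete, here is a minimal sketch. The names `posting`, `buildIndex`, and `tfidf` are hypothetical, and the weighting shown (raw term frequency times the natural log of N/df) is only one common variant; the project's `search.go` may use a different formula.

```go
package main

import (
	"fmt"
	"math"
)

// posting maps a document ID to the raw frequency of a term in that document.
type posting map[int]int

// buildIndex constructs an inverted index (term -> posting) from
// already tokenized documents. The slice position is used as the document ID.
func buildIndex(docs [][]string) map[string]posting {
	index := make(map[string]posting)
	for docID, tokens := range docs {
		for _, tok := range tokens {
			if index[tok] == nil {
				index[tok] = make(posting)
			}
			index[tok][docID]++
		}
	}
	return index
}

// tfidf returns tf * ln(N/df) for a term in a document.
// This weighting variant is an assumption made for the sketch.
func tfidf(index map[string]posting, term string, docID, numDocs int) float64 {
	p, ok := index[term]
	if !ok {
		return 0
	}
	tf := float64(p[docID])
	idf := math.Log(float64(numDocs) / float64(len(p)))
	return tf * idf
}

func main() {
	docs := [][]string{
		{"harga", "rumah", "naik"},
		{"harga", "apartemen", "turun"},
		{"kebijakan", "rumah", "subsidi"},
	}
	index := buildIndex(docs)

	// "harga" appears in 2 of 3 documents, "naik" in only 1,
	// so "naik" receives the higher weight in document 0.
	fmt.Printf("harga: %.3f\n", tfidf(index, "harga", 0, len(docs)))
	fmt.Printf("naik:  %.3f\n", tfidf(index, "naik", 0, len(docs)))
}
```

The index lets a query touch only the documents that contain its terms, instead of scanning every article.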
- Make sure Go is installed on your system
- Clone the repository
- Install dependencies:
  go mod download
- Run the application:
  go run main.go
- Access the application at http://localhost:8080
The application supports two search methods:
- Cosine Similarity (Default)
  - Measures the cosine of the angle between the query and document vectors
  - Normalizes by vector length, so it handles documents of different lengths better
  - Takes term weights into account, which makes it more sensitive to content similarity
- Jaccard Similarity
  - Measures the size of the intersection of two term sets divided by the size of their union
  - Good for quick similarity approximations
  - Useful for comparing documents as sets of terms
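
The two measures can be illustrated with a short sketch. The function names `cosine` and `jaccard` and the sparse map representation of the vectors are assumptions for this example; they are not taken from the project's source.

```go
package main

import (
	"fmt"
	"math"
)

// cosine computes the cosine similarity between two sparse
// term-weight vectors (for example, TF-IDF weights keyed by term).
func cosine(a, b map[string]float64) float64 {
	var dot, normA, normB float64
	for term, wa := range a {
		dot += wa * b[term] // missing terms contribute 0
		normA += wa * wa
	}
	for _, wb := range b {
		normB += wb * wb
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

// jaccard computes |A ∩ B| / |A ∪ B| over the distinct terms
// of two token lists.
func jaccard(a, b []string) float64 {
	setA := make(map[string]bool)
	for _, t := range a {
		setA[t] = true
	}
	setB := make(map[string]bool)
	for _, t := range b {
		setB[t] = true
	}
	inter := 0
	for t := range setA {
		if setB[t] {
			inter++
		}
	}
	union := len(setA) + len(setB) - inter
	if union == 0 {
		return 0
	}
	return float64(inter) / float64(union)
}

func main() {
	query := map[string]float64{"harga": 1, "rumah": 1}
	doc := map[string]float64{"harga": 1, "rumah": 1, "naik": 1}
	fmt.Printf("cosine:  %.4f\n", cosine(query, doc))

	fmt.Printf("jaccard: %.4f\n",
		jaccard([]string{"harga", "rumah"}, []string{"harga", "rumah", "naik"}))
}
```

Note that Jaccard ignores term weights and frequencies entirely, which is why it is cheaper to compute but coarser than cosine similarity.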
- Thread-safe search engine implementation
- Efficient inverted index structure
- Optimized content preview generation
- Fast search response times
- Memory-efficient data structures
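
The source does not describe how thread safety is achieved. One common approach in Go, shown here purely as an assumption, is to guard the index with a `sync.RWMutex` so that many searches can read concurrently while index updates take an exclusive lock. The `Engine` type and its methods are hypothetical names for this sketch.

```go
package main

import (
	"fmt"
	"sync"
)

// Engine is a hypothetical wrapper around an inverted index.
// The RWMutex allows concurrent readers and exclusive writers.
type Engine struct {
	mu    sync.RWMutex
	index map[string][]int // term -> IDs of documents containing it
}

// NewEngine returns an Engine with an empty index.
func NewEngine() *Engine {
	return &Engine{index: make(map[string][]int)}
}

// Add indexes the distinct tokens of one document under a write lock.
func (e *Engine) Add(docID int, tokens []string) {
	e.mu.Lock()
	defer e.mu.Unlock()
	seen := make(map[string]bool)
	for _, tok := range tokens {
		if seen[tok] {
			continue
		}
		seen[tok] = true
		e.index[tok] = append(e.index[tok], docID)
	}
}

// Lookup returns a copy of the posting list under a read lock,
// so callers cannot mutate the shared slice.
func (e *Engine) Lookup(term string) []int {
	e.mu.RLock()
	defer e.mu.RUnlock()
	return append([]int(nil), e.index[term]...)
}

func main() {
	e := NewEngine()
	var wg sync.WaitGroup
	for id := 0; id < 4; id++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			e.Add(id, []string{"rumah", "harga"})
		}(id)
	}
	wg.Wait()
	fmt.Println(len(e.Lookup("rumah"))) // prints: 4
}
```

If the index is built once at startup and never modified afterwards, the lock may be unnecessary, since concurrent reads of an unchanging map are safe in Go.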
This project is part of an academic assignment at Universitas Syiah Kuala.