# README
## Web Scraping Project
This project is a web scraping application written in Go with the Gin framework. It uses the colly package (https://github.com/gocolly/colly) to scrape web content.
### Installation
- Clone the repository: `git clone git@github.com:jobayer12/ScrapifyGo.git`
- Change to the project directory: `cd ScrapifyGo`
- Install the required dependencies: `go mod tidy`
- Run the application: `go run cmd/api/main.go` or `make run`
### Usage
Once the application is running, you can access the API endpoints to perform various scraping tasks. The base URL for the API is http://localhost:8080.
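Every endpoint takes the page to scrape as a `url` query parameter. When that page's address has a query string of its own, it needs to be percent-encoded. Below is a minimal sketch of building such a request URL with the standard library. The helper name `endpointURL` is hypothetical, and the sketch assumes the server decodes standard query encoding, which is how Gin's query helpers normally behave.

```go
package main

import (
	"fmt"
	"net/url"
)

// baseURL is the address given in this README.
const baseURL = "http://localhost:8080"

// endpointURL (hypothetical helper) joins the base URL, an API path,
// and the page to scrape. The target goes through url.Values so that
// characters such as '&' and '?' are percent-encoded.
func endpointURL(path, target string) string {
	q := url.Values{}
	q.Set("url", target)
	return baseURL + path + "?" + q.Encode()
}

func main() {
	// The target page here is made up for illustration.
	fmt.Println(endpointURL("/api/v1/url", "https://example.com/?page=2&lang=en"))
}
```

Without the encoding step, `&lang=en` would be read as a second parameter of the API request rather than as part of the target page's address.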
### API Endpoints
#### 1. Scrape Sitemap Data
- Endpoint: `/api/v1/sitemap?url=https://example.com/sitemap.xml`
- Method: GET
- Description: Scrapes the sitemap data from a given URL.
- Response: `{ "data": [ { "changefreq": "string", "lastmod": "string", "loc": "string", "priority": "string" } ], "error": "string", "status": 0 }`
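The sitemap response shown above can be decoded into typed Go values on the client side. This is a sketch only: the type names `SitemapEntry` and `SitemapResponse` are hypothetical, the field tags follow the JSON keys listed in the response, and the sample body (including the status value 200) is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SitemapEntry mirrors one element of "data" in the documented response.
type SitemapEntry struct {
	ChangeFreq string `json:"changefreq"`
	LastMod    string `json:"lastmod"`
	Loc        string `json:"loc"`
	Priority   string `json:"priority"`
}

// SitemapResponse is a hypothetical name for the response envelope.
type SitemapResponse struct {
	Data   []SitemapEntry `json:"data"`
	Error  string         `json:"error"`
	Status int            `json:"status"`
}

// parseSitemapResponse decodes a raw response body.
func parseSitemapResponse(body []byte) (SitemapResponse, error) {
	var r SitemapResponse
	err := json.Unmarshal(body, &r)
	return r, err
}

func main() {
	// Sample body invented for this example; real values will differ.
	sample := []byte(`{"data":[{"changefreq":"daily","lastmod":"2024-01-01",` +
		`"loc":"https://example.com/","priority":"0.8"}],"error":"","status":200}`)

	r, err := parseSitemapResponse(sample)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	for _, e := range r.Data {
		fmt.Println(e.Loc, e.LastMod, e.Priority)
	}
}
```

All four entry fields are kept as strings because the response documents them that way, so a client that needs a numeric priority or a parsed date has to convert them itself.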
#### 2. Scrape URL from Website
- Endpoint: `/api/v1/url?url=https://example.com`
- Method: GET
- Description: Scrapes URLs from a given web page.
- Response: `{ "data": [ "string" ], "error": "string", "status": 0 }`
#### 3. Scrape Emails from URLs
- Endpoint: `/api/v1/email?url=https://example.com`
- Method: GET
- Description: Scrapes email addresses from a list of URLs.
- Response: `{ "data": [ "string" ], "error": "string", "status": 0 }`
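The URL and email endpoints return the same envelope, with `data` as a list of strings, so one small client can serve both. The sketch below assumes the application is running locally on port 8080 as described under Usage. `ListResponse`, `parseListResponse`, and `fetchList` are hypothetical names, and the 30-second timeout is an arbitrary choice.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"time"
)

// ListResponse is a hypothetical name for the envelope shared by
// /api/v1/url and /api/v1/email.
type ListResponse struct {
	Data   []string `json:"data"`
	Error  string   `json:"error"`
	Status int      `json:"status"`
}

// parseListResponse decodes a raw response body.
func parseListResponse(body []byte) (ListResponse, error) {
	var r ListResponse
	err := json.Unmarshal(body, &r)
	return r, err
}

// fetchList sends a GET request to a list endpoint for the target page
// and decodes the JSON body.
func fetchList(client *http.Client, endpoint, target string) (ListResponse, error) {
	q := url.Values{}
	q.Set("url", target)
	resp, err := client.Get(endpoint + "?" + q.Encode())
	if err != nil {
		return ListResponse{}, err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return ListResponse{}, err
	}
	return parseListResponse(body)
}

func main() {
	// Scraping can be slow, so allow a generous (assumed) timeout.
	client := &http.Client{Timeout: 30 * time.Second}

	r, err := fetchList(client, "http://localhost:8080/api/v1/email", "https://example.com")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	if r.Error != "" {
		fmt.Println("API error:", r.Error)
		return
	}
	for _, item := range r.Data {
		fmt.Println(item)
	}
}
```

Swapping the endpoint argument for `http://localhost:8080/api/v1/url` reuses the same code for the link-scraping endpoint.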
### Contributing
Contributions are welcome! Please submit a pull request or open an issue to discuss any changes.