# README
Web-Analyzer
This is a simple web application that takes a URL as input, scrapes the web page, and provides details such as the HTML version, title, headings, internal and external links, inaccessible links, login form presence, and accessibility of URLs.
Features
- Scrapes a given URL for the following details:
  - HTML version
  - Page title
  - Number of headings by level
  - Internal and external links
  - Inaccessible links
  - Presence of a login form
  - Accessibility of URLs
- Displays results in a user-friendly format, including a table for URL accessibility.
- Swagger documentation for the API.
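
The split between internal and external links listed above could be computed by resolving each `href` against the page URL and comparing hosts. The following is a minimal sketch using only the standard library; `classifyLink` and its "internal" / "external" / "other" labels are hypothetical names chosen for illustration, not the project's actual helper.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// classifyLink is a hypothetical helper (not taken from the repository).
// It resolves href against pageURL and labels the result:
//   "internal" - http(s) link on the same host as the page
//   "external" - http(s) link on a different host
//   "other"    - non-http(s) schemes such as mailto: or tel:
func classifyLink(pageURL, href string) (string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return "", fmt.Errorf("parse page URL: %w", err)
	}
	ref, err := url.Parse(strings.TrimSpace(href))
	if err != nil {
		return "", fmt.Errorf("parse href: %w", err)
	}

	// ResolveReference turns relative links like "/about" into absolute URLs.
	target := base.ResolveReference(ref)
	if target.Scheme != "http" && target.Scheme != "https" {
		return "other", nil
	}
	if strings.EqualFold(target.Hostname(), base.Hostname()) {
		return "internal", nil
	}
	return "external", nil
}

func main() {
	page := "https://example.com/docs/index.html"
	hrefs := []string{"/about", "guide.html", "https://example.org/", "mailto:team@example.com"}
	for _, href := range hrefs {
		kind, err := classifyLink(page, href)
		if err != nil {
			fmt.Println(href, "error:", err)
			continue
		}
		fmt.Printf("%-28s %s\n", href, kind)
	}
}
```

Comparing hostnames treats subdomains (for example `blog.example.com`) as external; whether that matches the application's own rule is an assumption.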
Prerequisites
- Go (version 1.21+)
- Gin web framework
- `golang.org/x/net/html` package for HTML parsing
- Swaggo for Swagger documentation
Installation
- Clone the repository:
    git clone https://github.com/venusaran/web-analyzer.git
    cd web-analyzer
- Install dependencies:
    go mod tidy
- Install `swag` for generating Swagger documentation:
    go install github.com/swaggo/swag/cmd/swag@latest
- Generate Swagger documentation:
    swag init
Running the Application
- Start the Go server:
    go run cmd/main.go
- Open your web browser and navigate to http://localhost:8080
Usage
- Enter a URL in the input box.
- Click the "Submit" button.
- Wait for the spinner to disappear, indicating that the data has been fetched.
- View the results, including a table of accessible URLs.
Accessing Swagger Documentation
Once the server is running, you can access the Swagger documentation at: Swagger Docs
Test Cases
To run the test cases, follow the steps below:
cd internal/service/scraper
go test
Project Structure
.
├── api/
│   └── rest/
│       └── controller/
│           └── analyzer/
│               └── analyzer.go
├── router/
│   └── router.go
├── cmd/
│   └── main.go
├── docs/
│   ├── docs.go
│   ├── swagger.json
│   └── swagger.yaml
├── internal/
│   └── service/
│       └── scraper/
│           ├── helper.go
│           ├── scraper.go
│           └── helper_test.go
├── pkg/
│   ├── constants/
│   │   └── constants.go
│   └── interfaces/
│       └── interfaces.go
├── util/
│   └── utility.go
├── static/
│   └── index.html
├── vendor/
├── .gitignore
├── go.mod
├── go.sum
└── README.md
Example
Here's a sample output you can expect from the application:
HTML Version: HTML 5
Title: Example Title
Headings: {"h1": 2, "h2": 3, "h3": 1}
Internal Links: 5
External Links: 10
Inaccessible Links: 2
Login Form Present: true
Accessible URLs:
https://example.com: Accessible
https://example.org: Inaccessible
Contributing
- Fork the repository.
- Create a new branch (`git checkout -b feature-branch`).
- Make your changes.
- Commit your changes (`git commit -am 'Add new feature'`).
- Push to the branch (`git push origin feature-branch`).
- Create a new Pull Request.
License
This project is licensed under the MIT License.
Acknowledgements
- Gin Gonic - HTTP web framework for Go
- golang.org/x/net/html - HTML parsing library
- Swaggo - Automatically generate RESTful API documentation with Swagger 2.0 for Go.