github.com/editorpost/article
module, package
Version: 0.0.2
Repository: https://github.com/editorpost/article.git
Documentation: pkg.go.dev

# README

Summary for Busy Developers

  • Create an Article: Use article.NewArticle() to initialize a new article.
  • Minimal Invariant: Ensure fields ID, Title, Content, TextContent, and PublishDate are provided.
  • Normalize and Validate: Call article.Normalize() to trim and validate all fields.
  • Field Limits: Text fields are trimmed to specific lengths, and URLs are validated with a max length of 4096 characters.
  • Recommended Practice: Always use the constructor article.NewArticle() to ensure the structure is close to its minimal invariant.

Example Code

package main

import (
    "time"

    "github.com/editorpost/article"
)

func main() {
    // Create a new article
    art := article.NewArticle()

    // Set required fields
    art.Title = "Sample Title"
    art.Content = "This is the content of the article."
    art.TextContent = "This is the text content of the article."
    art.PublishDate = time.Now()

    // Normalize and validate the article
    art.Normalize()
}

Full example of the JSON output for an Article with nested structures:


 
 {
 	"id": "string (required, uuid4, max=36)",
 	"title": "string (required, max=255)",
 	"summary": "string (max=255)",
 	"markup": "string (required, max=65000)",
 	"text": "string (required, max=65000)",
 	"genre": "string (max=500)",
 	"source_url": "string (omitempty, url, max=4096)",
 	"language": "string (max=255)",
 	"category": "string (max=255)",
 	"source_name": "string (max=255)",
 	"published": "time.Time (required)",
 	"modified": "time.Time",
 	"images": [
 		{
 			"id": "string (required, uuid4, max=36)",
 			"url": "string (required, url, max=4096)",
 			"title": "string (max=500)",
 			"alt": "string (required, max=255)",
 			"width": "int",
 			"height": "int"
 		}
 	],
 	"videos": [
 		{
 			"id": "string (required, uuid4, max=36)",
 			"url": "string (required, url, max=4096)",
 			"embed": "string (max=65000)",
 			"title": "string (max=500)"
 		}
 	],
 	"quotes": [
 		{
 			"id": "string (required, uuid4, max=36)",
 			"text": "string (required, max=65000)",
 			"author": "string (max=255)",
 			"source_url": "string (required, url, max=4096)",
 			"platform": "string (max=255)"
 		}
 	],
 	"tags": [
 		"string"
 	],
 	"socials": [
 		{
 			"id": "string (required, uuid4, max=36)",
 			"platform": "string (max=255)",
 			"url": "string (max=4096)"
 		}
 	]
 }

This documentation provides a comprehensive guide to using the article package, covering architecture, usage, and validation limits. By following these guidelines, developers can ensure that their articles are well-structured and validated.

Article Package Documentation

Overview

The article package provides a structured way to handle and validate article data, ensuring consistency and integrity. The package is designed to normalize and validate various types of content associated with an article, including images, videos, quotes, and social media profiles.

Architecture

The article package is built around the Article struct, which includes various fields to store article metadata and content. Each nested structure (Image, Video, Quote, and SocialProfile) has its own validation and normalization logic to ensure data integrity.

Validation Limits

The package enforces several validation limits to ensure data consistency and prevent overflow attacks:

  • URL Fields: Maximum length of 4096 characters.
  • Text Fields: Trimmed and limited to specific lengths (e.g., title: 255 characters, caption: 500 characters).
  • Author Name: Limited to 255 characters.
  • Language Code: Must be a valid ISO 639-1 code (2 characters).
  • Content: Text content fields are limited to 65000 characters.

These limits ensure that the data remains manageable and secure, suitable for database storage and processing.
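
The limits listed above could be checked with helpers along these lines. This is a minimal sketch using only the standard library; the constant and function names are hypothetical, and the package's own checks may differ (for example, it may count URL length in characters rather than bytes).

```go
package main

import (
	"fmt"
	"net/url"
	"unicode/utf8"
)

// Limits taken from the list above; the constant names are made up for this sketch.
const (
	maxURLLen     = 4096
	maxTitleLen   = 255
	maxContentLen = 65000
)

// validURL reports whether s is an absolute URL of at most maxURLLen bytes.
func validURL(s string) bool {
	if len(s) == 0 || len(s) > maxURLLen {
		return false
	}
	u, err := url.Parse(s)
	return err == nil && u.Scheme != "" && u.Host != ""
}

// validLanguage reports whether s has the shape of an ISO 639-1 code:
// exactly two lowercase ASCII letters. It does not consult the ISO registry.
func validLanguage(s string) bool {
	if len(s) != 2 {
		return false
	}
	for i := 0; i < len(s); i++ {
		if s[i] < 'a' || s[i] > 'z' {
			return false
		}
	}
	return true
}

// withinLimit reports whether s contains at most limit runes.
func withinLimit(s string, limit int) bool {
	return utf8.RuneCountInString(s) <= limit
}

func main() {
	fmt.Println(validURL("https://example.com/post/1"))
	fmt.Println(validLanguage("en"), validLanguage("eng"))
	fmt.Println(withinLimit("Sample Title", maxTitleLen))
}
```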

Normalization Approach

Normalization in the article package involves:

  1. Trimming leading and trailing whitespace from all text fields.
  2. Trimming text fields to their maximum allowed lengths.
  3. Validating URLs and setting invalid fields to their zero values.
  4. Logging validation errors rather than failing, so that normalization always completes.
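
The four steps above can be sketched as follows. The article type, its two fields, and the helper names are hypothetical stand-ins chosen to keep the example short; this is not the package's Normalize implementation.

```go
package main

import (
	"fmt"
	"log"
	"net/url"
	"strings"
)

// trimToMaxLen cuts s to at most limit runes, so multi-byte characters
// are never split in the middle.
func trimToMaxLen(s string, limit int) string {
	if limit <= 0 {
		return ""
	}
	r := []rune(s)
	if len(r) <= limit {
		return s
	}
	return string(r[:limit])
}

// validURL reports whether s is an absolute URL of at most 4096 bytes.
func validURL(s string) bool {
	if len(s) > 4096 {
		return false
	}
	u, err := url.Parse(s)
	return err == nil && u.Scheme != "" && u.Host != ""
}

// article is a reduced, hypothetical stand-in for the package's Article.
type article struct {
	Title     string
	SourceURL string
}

// normalize applies the steps in order: trim whitespace, cut to the
// maximum length, then validate the URL and reset it to the zero value
// (logging the problem) when it is invalid.
func (a *article) normalize() {
	a.Title = trimToMaxLen(strings.TrimSpace(a.Title), 255)
	a.SourceURL = strings.TrimSpace(a.SourceURL)
	if a.SourceURL != "" && !validURL(a.SourceURL) {
		log.Printf("article: dropping invalid source URL %q", a.SourceURL)
		a.SourceURL = ""
	}
}

func main() {
	a := &article{Title: "  Sample Title  ", SourceURL: "not a url"}
	a.normalize()
	fmt.Printf("title=%q source=%q\n", a.Title, a.SourceURL)
}
```

Truncating by runes rather than bytes matches the description of TrimToMaxLen in the function list; the 255 and 4096 limits come from the Validation Limits section.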

Usage

Creating an Article

To create an article, use the NewArticle constructor to initialize a new Article struct with default values. This ensures the structure is close to its minimal invariant.

art := article.NewArticle()

Minimal Invariant

The minimal invariant for an Article includes the following required fields:

  • ID
  • Title
  • Content
  • TextContent
  • PublishDate

Here is an example of a minimal invariant in JSON format:

{
  "id": "123e4567-e89b-12d3-a456-426614174000",
  "title": "Sample Title",
  "markup": "This is the content of the article.",
  "text": "This is the text content of the article.",
  "published": "2024-01-01T00:00:00Z"
}
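
One way to check a JSON payload against this minimal invariant is sketched below. The minimalArticle type and the checkMinimal function are hypothetical and written only for this example; the package itself exposes NewArticleFromMap for validated construction.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"time"
)

// minimalArticle holds only the required keys (hypothetical type).
type minimalArticle struct {
	ID        string    `json:"id"`
	Title     string    `json:"title"`
	Markup    string    `json:"markup"`
	Text      string    `json:"text"`
	Published time.Time `json:"published"`
}

// checkMinimal decodes data and returns an error naming every required
// key that is missing or empty.
func checkMinimal(data []byte) error {
	var a minimalArticle
	if err := json.Unmarshal(data, &a); err != nil {
		return fmt.Errorf("decode article: %w", err)
	}
	var missing []string
	if a.ID == "" {
		missing = append(missing, "id")
	}
	if a.Title == "" {
		missing = append(missing, "title")
	}
	if a.Markup == "" {
		missing = append(missing, "markup")
	}
	if a.Text == "" {
		missing = append(missing, "text")
	}
	if a.Published.IsZero() {
		missing = append(missing, "published")
	}
	if len(missing) > 0 {
		return fmt.Errorf("missing required fields: %s", strings.Join(missing, ", "))
	}
	return nil
}

func main() {
	ok := []byte(`{
	  "id": "123e4567-e89b-12d3-a456-426614174000",
	  "title": "Sample Title",
	  "markup": "This is the content of the article.",
	  "text": "This is the text content of the article.",
	  "published": "2024-01-01T00:00:00Z"
	}`)
	fmt.Println(checkMinimal(ok))
	fmt.Println(checkMinimal([]byte(`{"id": "only-an-id"}`)))
}
```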

Normalization and Validation

To normalize and validate an article, call the Normalize method on the Article struct. This method trims and validates all fields, logging any validation errors and clearing invalid fields.

art.Normalize()

Fields

Article

The Article struct includes the following fields:

  • ID: UUID of the article (required, max length: 36).
  • Title: Title of the article (required, max length: 255).
  • Byline: Author(s) of the article (optional, max length: 255).
  • Content: Full content of the article (required, max length: 65000).
  • TextContent: Text content of the article (required, max length: 65000).
  • Excerpt: Short excerpt of the article (optional, max length: 500).
  • PublishDate: Publication date of the article (required).
  • ModifiedDate: Last modification date of the article (optional).
  • Images: List of images associated with the article.
  • Videos: List of videos associated with the article.
  • Quotes: List of quotes associated with the article.
  • Tags: List of tags associated with the article.
  • Source: Source URL of the article (optional, max length: 4096).
  • Language: Language code of the article (required, max length: 2).
  • Category: Category of the article (optional, max length: 255).
  • SiteName: Site name where the article is published (optional, max length: 255).
  • AuthorSocialProfiles: List of social media profiles of the authors.

Image

The Image struct includes the following fields:

  • URL: URL of the image (required, max length: 4096).
  • AltText: Alternative text for the image (optional, max length: 255).
  • Width: Width of the image in pixels (optional, min: 0).
  • Height: Height of the image in pixels (optional, min: 0).
  • Caption: Caption for the image (optional, max length: 500).

Video

The Video struct includes the following fields:

  • URL: URL of the video (required, max length: 4096).
  • EmbedCode: Embed code for the video (optional, max length: 65000).
  • Caption: Caption for the video (optional, max length: 500).

Quote

The Quote struct includes the following fields:

  • Text: Text of the quote (required).
  • Author: Author of the quote (optional, max length: 255).
  • Source: Source URL of the quote (optional, max length: 4096).
  • Platform: Platform where the quote was found (optional, max length: 255).

SocialProfile

The SocialProfile struct includes the following fields:

  • Platform: Platform name (required, max length: 255).
  • URL: URL of the social profile (required, max length: 4096).

# Functions

GetStringSlice safely extracts a slice of strings from the map or returns a zero value.
IntFromMap safely extracts an int from the map or returns a zero value.
NewArticle creates a new Article with the provided data and returns a pointer to the Article.
NewArticleFromMap creates an Article from a map[string]any, validates it, and returns a pointer to the Article or an error.
NewImage creates a new Image with a random UUID.
NewImageFromMap creates an Image from a map[string]any, validates it, and returns a pointer to the Image or an error.
NewImages creates a collection, skips invalid items, and logs errors.
NewImagesStrict creates a collection and validates every image.
NewMedia creates a new Media with a random UUID.
NewMediaFromMap creates a Media from a map[string]any, validates it, and returns a pointer to the Media or an error.
NewMedias creates a collection, skips invalid items, and logs errors.
NewMediasStrict creates a collection and validates every media.
NewPerson creates a new Person with a random ID.
NewPersons creates a new Persons collection.
NewQuoteFromMap creates a Quote from a map[string]any, validates it, and returns a pointer to the Quote or an error.
NewQuotes creates a collection, skips invalid items, and logs errors.
NewQuotesStrict creates a collection and validates every quote.
NewSocialProfileFromMap creates a Social from a map[string]any, validates it, and returns a pointer to the Social or an error.
NewSocials creates a collection, skips invalid social items, and logs errors.
NewSocialsStrict creates a collection and validates every social profile.
NewVideo creates a new Video with a random UUID.
NewVideoFromMap creates a Video from a map[string]any, validates it, and returns a pointer to the Video or an error.
NewVideos creates a collection, skips invalid videos, and logs errors.
NewVideosStrict creates a collection and validates every video.
StringFromMap safely extracts a string from the map or returns a zero value.
TrimToMaxLen trims the input string to the specified maximum length, ensuring that it doesn't exceed the length in runes.
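
The function list pairs map-based constructors with two collection builders: a lenient one that skips and logs invalid items, and a strict one that validates every item. The sketch below illustrates that pattern with lowercase, hypothetical names and a reduced image type. The signatures, including the error returned by the strict builder, are assumptions and are not copied from the package.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// stringFromMap returns m[key] if it holds a string, otherwise "".
func stringFromMap(m map[string]any, key string) string {
	s, _ := m[key].(string)
	return s
}

// intFromMap returns m[key] as an int, otherwise 0. Numbers decoded from
// JSON arrive as float64, so both int and float64 are accepted.
func intFromMap(m map[string]any, key string) int {
	switch v := m[key].(type) {
	case int:
		return v
	case float64:
		return int(v)
	}
	return 0
}

// image is a reduced, hypothetical stand-in for the package's Image.
type image struct {
	URL   string
	Width int
}

// imageFromMap builds an image and rejects it when the URL is missing.
func imageFromMap(m map[string]any) (*image, error) {
	img := &image{URL: stringFromMap(m, "url"), Width: intFromMap(m, "width")}
	if img.URL == "" {
		return nil, errors.New("image url is required")
	}
	return img, nil
}

// newImages is the lenient builder: invalid items are logged and skipped.
func newImages(items []map[string]any) []*image {
	out := make([]*image, 0, len(items))
	for _, m := range items {
		img, err := imageFromMap(m)
		if err != nil {
			log.Printf("skipping image: %v", err)
			continue
		}
		out = append(out, img)
	}
	return out
}

// newImagesStrict is the strict builder: the first invalid item aborts.
func newImagesStrict(items []map[string]any) ([]*image, error) {
	out := make([]*image, 0, len(items))
	for i, m := range items {
		img, err := imageFromMap(m)
		if err != nil {
			return nil, fmt.Errorf("image %d: %w", i, err)
		}
		out = append(out, img)
	}
	return out, nil
}

func main() {
	items := []map[string]any{
		{"url": "https://example.com/a.png", "width": 640.0},
		{"width": 320},
	}
	fmt.Println(len(newImages(items)))
	_, err := newImagesStrict(items)
	fmt.Println(err)
}
```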

# Structs

Article
Article represents a news article with various types of content.
Image represents an image in the article.
Images represents a collection of Image pointers.
Media represents a media in the article.
Medias represents a collection of Media pointers.
Quote represents a quote from social media in the article.
Quotes represents a collection of Quote pointers.
Social represents a social media link.
Socials represents a collection of Social pointers.
Video represents a video in the article.
Videos represents a collection of Video pointers.

# Type aliases

FilterFn is a function to filter articles.
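
The signature of FilterFn is not shown here. As a sketch only, a filter of this kind is often a predicate over an article, as below; the filterFn type, the article fields, and both helpers are assumptions made for illustration.

```go
package main

import "fmt"

// article is a reduced, hypothetical stand-in for the package's Article.
type article struct {
	Title    string
	Language string
}

// filterFn reports whether an article should be kept (assumed signature).
type filterFn func(a *article) bool

// filter returns the articles for which keep returns true.
func filter(in []*article, keep filterFn) []*article {
	out := make([]*article, 0, len(in))
	for _, a := range in {
		if keep(a) {
			out = append(out, a)
		}
	}
	return out
}

// byLanguage builds a filterFn that keeps articles in the given language.
func byLanguage(lang string) filterFn {
	return func(a *article) bool { return a.Language == lang }
}

func main() {
	articles := []*article{
		{Title: "Hello", Language: "en"},
		{Title: "Bonjour", Language: "fr"},
	}
	for _, a := range filter(articles, byLanguage("en")) {
		fmt.Println(a.Title)
	}
}
```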