Module: github.com/drgsn/filefusion
Version: 0.1.4
Repository: https://github.com/drgsn/filefusion.git
Documentation: pkg.go.dev

# README

FileFusion 🚀


FileFusion is a command-line tool that concatenates and processes files into a format optimized for Large Language Models (LLMs).

Installation • Quick Start • Features • Documentation • Examples

✨ Features

FileFusion streamlines your file processing workflow with:

Core Features

  • 📦 Multiple Output Formats

    • Support for XML, JSON, and YAML
    • Preserved file metadata and structure
    • Configurable output formatting
  • 🎯 Smart Pattern Matching

  • ⚡️ High Performance

    • Concurrent file processing
    • Efficient memory usage
    • Automatic file splitting for large outputs

Processing Features

  • 📊 Advanced Size Control

    • Individual file size limits
    • Total output size management
    • Automatic output splitting
    • Detailed size reporting
  • 🧹 Intelligent Code Cleaning

    • Multi-language support
    • Comment preservation options
    • Code structure optimization
    • Whitespace management
  • 🔒 Reliability & Safety

    • Atomic write operations
    • Thorough error checking
    • Dry run support
    • Symlink handling

🚀 Quick Start

Get started with FileFusion in three simple steps:

  1. Install:

    curl -fsSL https://raw.githubusercontent.com/drgsn/filefusion/main/install.sh | bash
    
  2. Process current directory:

    filefusion
    
  3. Process specific files:

    filefusion --pattern "*.{js,py}" --clean -o output.xml /path/to/project
    

🚀 Installation

Quick Install (Recommended)

Using curl:

curl -fsSL https://raw.githubusercontent.com/drgsn/filefusion/main/install.sh | bash

Using wget:

wget -qO- https://raw.githubusercontent.com/drgsn/filefusion/main/install.sh | bash

Safe Install (Recommended Security Practice)

# Download the script, inspect it, then run it
curl -fsSL https://raw.githubusercontent.com/drgsn/filefusion/main/install.sh > install.sh
less install.sh   # review the contents before executing
chmod +x install.sh
./install.sh

Alternative Methods

Using Go:

go install github.com/drgsn/filefusion/cmd/filefusion@latest

Or download the latest binary for your platform from the releases page.

🗑️ Uninstallation

To uninstall FileFusion:

  1. Remove the installation directory:

    rm -rf ~/.filefusion

  2. Remove FileFusion from your shell configuration file. Depending on your shell and OS, edit one of these files:

    • macOS Bash users: ~/.bash_profile
    • Linux Bash users: ~/.bashrc
    • Zsh users: ~/.zshrc
    • Fish users: ~/.config/fish/config.fish
    • Windows PowerShell users: $HOME/Documents/PowerShell/Microsoft.PowerShell_profile.ps1

Look for and remove these two lines:

# FileFusion
export PATH="$PATH:$HOME/.filefusion"

Go Installation

If you installed using Go:

go clean -i github.com/drgsn/filefusion/cmd/filefusion

📋 Configuration

Default Values

| Setting         | Default Value            | Description                                  |
|-----------------|--------------------------|----------------------------------------------|
| Pattern         | *.go,*.json,*.yaml,*.yml | Default file patterns to process             |
| Max File Size   | 10MB                     | Maximum size for individual input files      |
| Max Output Size | 50MB                     | Maximum total size for all processed content |
| Max Output File | 30KB                     | Maximum size per output file (auto-splits)   |
| Output Format   | XML                      | Default output format when not specified     |
| Exclude Pattern | none                     | No files excluded by default                 |
| Clean Mode      | disabled                 | Code cleaning and optimization               |
| Dry Run         | disabled                 | Preview files to be processed                |

🎯 Basic Usage

Simple Commands

# Process current directory with defaults
filefusion

# Process specific directory
filefusion /path/to/project

# Process multiple directories
filefusion /path/to/project1 /path/to/project2

# Generate specific output format
filefusion -o output.json /path/to/project

🛠️ Flag Examples

Output Path (-o, --output)

# Generate XML output
filefusion -o output.xml /path/to/project

# Generate JSON output
filefusion -o output.json /path/to/project

# Generate YAML output
filefusion -o output.yaml /path/to/project

Pattern Matching Rules

For detailed pattern matching examples and rules, please refer to our Pattern Guide.

Here are some common patterns:

| Pattern       | Description                    |
|---------------|--------------------------------|
| *.go          | All Go files                   |
| *.{go,proto}  | All Go and Proto files         |
| src/**/*.js   | All JavaScript files under src |
| !vendor/**    | Exclude vendor directory       |
| **/*_test.go  | All Go test files              |
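
To make the comma-separated and brace forms concrete, here is a minimal Python sketch of how a spec such as `*.py,*.{js,ts}` could be split and expanded into plain globs. It is an illustration only, not FileFusion's matcher: `expand_braces`, `split_patterns`, and `matches_any` are hypothetical helpers, and `**` and `!` handling is deliberately left out (see the Pattern Guide for the real rules).

```python
import fnmatch
import re


def expand_braces(pattern):
    """Expand brace groups: '*.{go,proto}' -> ['*.go', '*.proto']."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def split_patterns(spec):
    """Split a spec on commas that are not inside a brace group."""
    parts, depth, current = [], 0, ""
    for ch in spec:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current)
            current = ""
        else:
            current += ch
    parts.append(current)
    return [p.strip() for p in parts if p.strip()]


def matches_any(filename, spec):
    """Return True if the file name matches any glob in the spec."""
    globs = [g for p in split_patterns(spec) for g in expand_braces(p)]
    return any(fnmatch.fnmatchcase(filename, g) for g in globs)
```

For example, `matches_any("main.go", "*.{go,proto}")` is true, while `matches_any("main.rs", "*.py,*.js")` is false.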

File Patterns (-p, --pattern)

# Process only Python and JavaScript files
filefusion --pattern "*.py,*.js" /path/to/project

# Process all source files
filefusion -p "*.go,*.rs,*.js,*.py,*.java" /path/to/project

# Include configuration files
filefusion -p "*.yaml,*.json,*.toml,*.ini" /path/to/project

Exclusions (-e, --exclude)

# Exclude test files
filefusion --exclude "*_test.go,test/**" /path/to/project

# Exclude build and vendor directories
filefusion -e "build/**,vendor/**,node_modules/**" /path/to/project

# Complex exclusion
filefusion -e "**/*.test.js,**/*tests*/**,**/dist/**" /path/to/project

Size Limits

# Increase individual file size limit to 20MB
filefusion --max-file-size 20MB /path/to/project

# Increase total output size limit to 100MB
filefusion --max-output-size 100MB /path/to/project

# Set maximum size per output file to 50KB (splits into multiple files if exceeded)
filefusion --max-output-file-size 50KB /path/to/project

# Set all size limits and enable cleaning
filefusion --max-file-size 20MB --max-output-size 100MB --max-output-file-size 50KB --clean /path/to/project

Size limits accept the suffixes B, KB, MB, GB, and TB.

When the processed content exceeds max-output-file-size, FileFusion automatically splits the output into multiple files with sequential numbering (e.g., output.1.xml, output.2.xml, output.3.xml).
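
The size arithmetic can be sketched in a few lines of Python. This is a rough estimate under stated assumptions, not FileFusion's implementation: `parse_size` and `split_names` are hypothetical helpers, the units are assumed to be binary multiples (1KB = 1024 bytes), which the README does not specify, and the real tool may split at document boundaries and so produce a different file count.

```python
import os
import re

# Assumption: binary multiples. The README does not say whether KB means 1000 or 1024 bytes.
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text):
    """Convert a string such as '20MB' into a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*(B|KB|MB|GB|TB)\s*", text.upper())
    if match is None:
        raise ValueError(f"invalid size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def split_names(output, total_bytes, max_file_bytes):
    """Estimate the numbered file names produced when the output is split."""
    if total_bytes <= max_file_bytes:
        return [output]
    count = -(-total_bytes // max_file_bytes)  # ceiling division
    stem, ext = os.path.splitext(output)
    return [f"{stem}.{i}{ext}" for i in range(1, count + 1)]
```

With the default 30KB limit, roughly 70KB of processed content would map to `output.1.xml`, `output.2.xml`, and `output.3.xml`.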

📚 Code Cleaning

FileFusion includes a powerful code cleaning engine that optimizes files for LLM processing while preserving functionality. The cleaner supports multiple programming languages and offers various optimization options.

Supported Languages

  • Go, Java, Python, Swift, Kotlin
  • JavaScript, TypeScript, HTML, CSS
  • C++, C#, PHP, Ruby
  • SQL, Bash

Cleaning Options

| Option                         | Description                  | Default |
|--------------------------------|------------------------------|---------|
| --clean                        | Enable code cleaning         | false   |
| --clean-remove-comments        | Remove all comments          | true    |
| --clean-preserve-doc-comments  | Keep documentation comments  | true    |
| --clean-remove-imports         | Remove import statements     | false   |
| --clean-remove-logging         | Remove logging statements    | true    |
| --clean-remove-getters-setters | Remove getter/setter methods | true    |
| --clean-optimize-whitespace    | Optimize whitespace          | true    |

Cleaning Examples

# Basic cleaning with default options
filefusion --clean input.go -o clean.xml

# Preserve all comments
filefusion --clean --clean-remove-comments=false input.py -o clean.xml

# Remove everything except essential code
filefusion --clean \
  --clean-remove-comments \
  --clean-preserve-doc-comments=false \
  --clean-remove-logging \
  --clean-remove-getters-setters \
  input.java -o clean.xml

# Clean TypeScript while preserving docs
filefusion --clean \
  --clean-preserve-doc-comments \
  --clean-remove-logging \
  --pattern "*.ts" \
  src/ -o clean.xml

# Clean multiple languages in a project
filefusion --clean \
  --pattern "*.{go,js,py}" \
  --clean-preserve-doc-comments \
  --clean-remove-logging \
  project/ -o clean.xml

Language-Specific Features

The cleaner automatically detects and handles language-specific patterns:

  • Logging Statements: Recognizes common logging patterns

    • Go: log., logger.
    • JavaScript/TypeScript: console., logger.
    • Python: logging., logger., print
    • Java: Logger., System.out., System.err.
    • And more...
  • Documentation: Preserves language-specific doc formats

    • Go: // and /* */ doc comments
    • Python: Docstrings
    • JavaScript/TypeScript: JSDoc
    • Java: Javadoc
  • Code Structure: Maintains language idioms while removing noise

    • Preserves package/module structure
    • Keeps essential imports
    • Removes debug/test code
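
As a simplified picture of what prefix-based logging removal means, the Python sketch below drops whole lines that begin with the Python prefixes listed above. It is a naive, line-based approximation and not how FileFusion's cleaner is implemented: `strip_logging_lines` is a hypothetical helper, and it ignores multi-line calls and blocks that would become empty once their only statement is removed.

```python
import re

# Prefixes taken from the Python entry in the list above.
# Matching whole lines only is a deliberate simplification.
LOGGING_CALL = re.compile(r"^\s*(logging\.|logger\.|print\()")


def strip_logging_lines(source):
    """Return the source with single-line logging statements removed."""
    kept = [line for line in source.splitlines() if not LOGGING_CALL.match(line)]
    return "\n".join(kept)
```

A robust cleaner would need to parse the source rather than match line prefixes.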

📚 Advanced Examples

Processing a Go Project

filefusion \
  --pattern "*.go" \
  --exclude "*_test.go,vendor/**" \
  --output project.json \
  --max-file-size 5MB \
  /path/to/go/project

Processing Web Project Files

filefusion \
  --pattern "*.js,*.ts,*.jsx,*.tsx,*.css,*.html" \
  --exclude "node_modules/**,dist/**,build/**" \
  --output web-project.xml \
  /path/to/web/project

Code Cleaning and Size Optimization

# Clean and optimize a Go project
filefusion \
  --pattern "*.go" \
  --exclude "*_test.go" \
  --clean \
  --clean-remove-comments \
  --clean-remove-logging \
  --output optimized.xml \
  /path/to/go/project

# Clean TypeScript/JavaScript with preserved documentation
filefusion \
  --pattern "*.ts,*.js" \
  --clean \
  --clean-preserve-doc-comments \
  --clean-remove-logging \
  --clean-optimize-whitespace \
  --output web-optimized.xml \
  /path/to/web/project
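
When these invocations are driven from a script, it can help to assemble the argument list programmatically. The Python sketch below is one possible wrapper: `build_command` is a hypothetical helper, it uses only flags that appear in the examples above, and it assumes `filefusion` is on the `PATH` when the command is eventually run.

```python
import shlex


def build_command(path, patterns=(), excludes=(), output=None, clean=False, dry_run=False):
    """Build a filefusion argument list from Python values."""
    argv = ["filefusion"]
    if patterns:
        argv += ["--pattern", ",".join(patterns)]
    if excludes:
        argv += ["--exclude", ",".join(excludes)]
    if output:
        argv += ["--output", output]
    if clean:
        argv.append("--clean")
    if dry_run:
        argv.append("--dry-run")
    argv.append(path)
    return argv


if __name__ == "__main__":
    argv = build_command(
        "/path/to/go/project",
        patterns=["*.go"],
        excludes=["*_test.go", "vendor/**"],
        output="project.json",
    )
    # Print a shell-quoted form for review; pass argv to subprocess.run to execute it.
    print(shlex.join(argv))
```

Passing a list to `subprocess.run(argv, check=True)` avoids shell quoting problems with patterns such as `vendor/**`.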

📄 Output Format Examples

XML Output

<?xml version="1.0" encoding="UTF-8"?>
<documents>
  <document index="1">
    <source>main.go</source>
    <document_content>
      package main
      ...
    </document_content>
  </document>
</documents>
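
A consumer can read this structure back with Python's standard library. The sketch below assumes the file is well-formed XML with exactly the element names shown above; how FileFusion escapes file content is not documented here, so treat that as an assumption. `read_documents` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def read_documents(xml_bytes):
    """Return (index, source, content) tuples from FileFusion-style XML."""
    root = ET.fromstring(xml_bytes)
    return [
        (
            int(doc.get("index")),
            doc.findtext("source"),
            doc.findtext("document_content"),
        )
        for doc in root.iter("document")
    ]


if __name__ == "__main__":
    # Read as bytes so the parser honours the encoding declaration.
    with open("output.xml", "rb") as handle:
        for index, source, content in read_documents(handle.read()):
            print(index, source, len(content))
```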

JSON Output

{
    "documents": [
        {
            "index": 1,
            "source": "main.go",
            "document_content": "package main\n..."
        }
    ]
}
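
The JSON form is the simplest to consume from a script. The Python sketch below loads it and reports the content length per source file; `summarize` is a hypothetical helper, and it relies only on the keys shown in the example above.

```python
import json


def summarize(json_text):
    """Map each source path to the length of its document content."""
    data = json.loads(json_text)
    return {doc["source"]: len(doc["document_content"]) for doc in data["documents"]}


if __name__ == "__main__":
    with open("output.json", encoding="utf-8") as handle:
        for source, length in summarize(handle.read()).items():
            print(f"{source}: {length} characters")
```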

YAML Output

documents:
    - index: 1
      source: main.go
      document_content: |
          package main
          ...

💡 Tips and Best Practices

  1. Start with Dry Run

    filefusion --dry-run /path/to/project
    

    This shows which files will be processed without making changes.

  2. Optimize for Large Projects

    filefusion --max-output-file-size 1MB --clean /path/to/project
    

    A larger per-file limit produces fewer split files, and cleaning reduces the amount of text the LLM has to read.

  3. Handle Large Codebases

    filefusion --pattern "*.{go,js}" --exclude "test/**,vendor/**" /path/to/project
    

    Use specific patterns and exclusions to focus on relevant files.

❗ Issues and Solutions

"no files found matching pattern"

  • Check if patterns match your file extensions
  • Verify files exist in the specified directory
  • Make sure patterns don't conflict with exclusions

"output size exceeds maximum"

  • Increase --max-output-size
  • Use more specific patterns
  • Split processing into multiple runs

"error processing files"

  • Check file permissions
  • Verify file encodings (UTF-8 recommended)
  • Ensure sufficient disk space
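
To check the encoding point before a run, a short script can list files that do not decode as UTF-8. This Python sketch is a diagnostic aid under simple assumptions: `find_non_utf8` is a hypothetical helper, it reads every file under the directory fully into memory, and it does not apply any FileFusion patterns or exclusions.

```python
from pathlib import Path


def find_non_utf8(root):
    """Return the files under root whose bytes are not valid UTF-8."""
    bad = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            bad.append(path)
    return bad


if __name__ == "__main__":
    for path in find_non_utf8("."):
        print(f"not UTF-8: {path}")
```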

📜 License

Mozilla Public License Version 2.0


Made with ❤️ by the DrGos
