Compare commits

..

13 Commits

Author SHA1 Message Date
antanst
bba00a9892 Add pprof server endpoint (optional, default off) 2025-10-16 15:06:27 +03:00
antanst
8e1297a230 . 2025-10-14 17:22:19 +03:00
antanst
d336bdffba . 2025-10-10 15:20:45 +03:00
antanst
3a5835fc42 . 2025-10-09 17:43:23 +03:00
antanst
2ead66f012 Refactor Gemini protocol implementation and improve server architecture
- Move gemini URL parsing from common/ to gemini/ package
- Add structured status codes in gemini/status_codes.go
- Improve error handling with proper Gemini status codes
- Update configuration field naming (Listen -> ListenAddr)
- Add UTF-8 validation for URLs
- Enhance security with better path validation
- Add CLAUDE.md for development guidance
- Include example content in srv/ directory
- Update build system to use standard shell
2025-06-06 15:02:25 +03:00
a426edb1f6 Use our own UID package. 2025-05-26 18:18:21 +03:00
2f231d4b12 Update Dockerfile 2025-05-26 16:51:01 +03:00
68dfd3cadd Update README.md 2025-05-26 16:50:26 +03:00
de320db166 Add go workspace. 2025-05-26 16:46:38 +03:00
7ea36d23dd Improve build system and Docker configuration
- Switch Dockerfile base image from golang:1.23-bookworm to debian:12-slim
- Update Dockerfile to use pre-built binary instead of building in container
- Fix Docker CMD to use new CLI flag format with --listen and --root-path
- Update Makefile to build binary to ./dist/ directory with CGO_ENABLED=0
- Make build-docker target depend on build target for efficiency
- Change clean target to remove ./dist directory instead of single binary
2025-05-26 13:29:33 +03:00
28008a320d Refactor error handling and logging system
- Replace custom errors package with xerrors for structured error handling
- Remove local logging wrapper and use git.antanst.com/antanst/logging
- Add proper error codes and user messages in server responses
- Improve connection handling with better error categorization
- Update certificate path to use local certs/ directory
- Add request size validation (1024 byte limit)
- Remove panic-on-error configuration option
- Enhance error logging with connection IDs and remote addresses
2025-05-26 13:28:16 +03:00
c78d7898f9 Update Go module dependencies and version
- Upgrade Go version from 1.23.4 to 1.24.3
- Replace zerolog dependency with local xerrors and logging modules
- Add local module replacements for git.antanst.com/antanst/xerrors and logging
- Remove unused color and system dependencies
- Keep gabriel-vasile/mimetype and matoous/go-nanoid dependencies
2025-05-26 13:28:01 +03:00
4456308d48 Replace environment variable config with CLI flag configuration
- Migrate from environment variables to CLI flags for configuration
- Add support for --listen, --root-path, --dir-indexing, --log-level, --response-timeout flags
- Remove config validation error struct as it's no longer needed
- Update .gitignore to exclude /dist directory
- Simplify configuration loading with flag.Parse()
2025-05-26 13:27:44 +03:00
66 changed files with 6573 additions and 610 deletions

2
.gitignore vendored

@@ -2,3 +2,5 @@
**/*~
/.idea
/run.sh
/dist
/go.work*

98
AGENTS.md Normal file

@@ -0,0 +1,98 @@
# CLAUDE.md
This file provides guidance to AI Agents such as Claude Code or ChatGPT Codex when working with code in this repository.
## General guidelines
Use idiomatic Go as much as possible. Prefer simple code over complex code.
## Project Overview
Gemserve is a simple Gemini protocol server written in Go that serves static files over TLS-encrypted connections. The Gemini protocol is a lightweight, privacy-focused alternative to HTTP designed for serving text-based content.
### Development Commands
```bash
# Build, test, and format everything
make
# Run tests only
make test
# Build binaries to ./dist/ (gemserve, gemget, gembench)
make build
# Format code with gofumpt and gci
make fmt
# Run golangci-lint
make lint
# Run linter with auto-fix
make lintfix
# Clean build artifacts
make clean
# Run the server (after building)
./dist/gemserve
# Generate TLS certificates for development
certs/generate.sh
```
### Architecture
Core Components
- **cmd/gemserve/gemserve.go**: Entry point with TLS server setup, signal handling, and graceful shutdown
- **cmd/gemget/**: Gemini protocol client for fetching content
- **cmd/gembench/**: Benchmarking tool for Gemini servers
- **server/**: Request processing, file serving, and Gemini protocol response handling
- **gemini/**: Gemini protocol implementation (URL parsing, status codes, path normalization)
- **config/**: CLI-based configuration system
- **lib/logging/**: Structured logging package with context-aware loggers
- **lib/apperrors/**: Application error handling (fatal vs non-fatal errors)
- **uid/**: Connection ID generation for logging (uses external vendor package)
Key Patterns
- **Security First**: All file operations use `filepath.IsLocal()` and path cleaning to prevent directory traversal
- **Error Handling**: Uses structured errors via `lib/apperrors` package distinguishing fatal from non-fatal errors
- **Logging**: Structured logging with configurable levels via internal logging package
- **Testing**: Table-driven tests with parallel execution, heavy focus on security edge cases
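For illustration, a minimal sketch of the path-safety pattern (`safeLocalPath` is a hypothetical name, not the server's actual helper, which additionally rejects backslashes and uses `filepath.Localize`):
```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// safeLocalPath treats the URL path as relative to the served root and
// rejects anything that lexically escapes it.
func safeLocalPath(root, urlPath string) (string, error) {
	rel := strings.TrimPrefix(urlPath, "/")
	if rel == "" {
		return root, nil // request for the root itself
	}
	if !filepath.IsLocal(rel) {
		return "", fmt.Errorf("path %q escapes the served root", urlPath)
	}
	return filepath.Join(root, rel), nil
}

func main() {
	for _, p := range []string{"/docs/index.gmi", "/../etc/passwd", "/a/../../secret"} {
		local, err := safeLocalPath("/srv", p)
		fmt.Println(p, "->", local, err)
	}
}
```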
Request Flow
1. TLS connection established on port 1965
2. Read up to 1KB request (Gemini spec limit)
3. Parse and normalize Gemini URL
4. Validate path security (prevent traversal)
5. Serve file or directory index with appropriate MIME type
6. Send response with proper Gemini status codes
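A compressed, runnable sketch of that flow (not the real handler in server/; TLS, path-safety checks and file I/O are stubbed out so the example stays self-contained):
```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/url"
	"strings"
)

// handleOnce sketches steps 2-6 for one already-established connection.
// The real server.HandleConnection adds TLS, timeouts, logging and file serving.
func handleOnce(conn net.Conn) {
	defer conn.Close()
	buf := make([]byte, 1024) // Gemini caps requests at 1024 bytes
	n, _ := conn.Read(buf)
	// A request is just a URL followed by CRLF, e.g. "gemini://host/path\r\n".
	rawURL := strings.TrimSpace(string(buf[:n]))
	u, err := url.Parse(rawURL)
	if err != nil || u.Scheme != "gemini" {
		fmt.Fprint(conn, "59 bad request\r\n") // 59 = bad request
		return
	}
	// A success header is "<status> <meta>\r\n", followed by the body.
	fmt.Fprintf(conn, "20 text/gemini\r\n# You asked for %s\r\n", u.Path)
}

func main() {
	client, server := net.Pipe() // stand-in for a real TLS connection
	go handleOnce(server)
	fmt.Fprint(client, "gemini://localhost/index.gmi\r\n")
	reply, _ := io.ReadAll(client)
	fmt.Printf("%s", reply)
}
```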
Configuration
Server configured via CLI flags:
- `--listen`: Server address (default: localhost:1965)
- `--root-path`: Directory to serve files from
- `--dir-indexing`: Enable directory browsing (default: false)
- `--log-level`: Logging verbosity (debug, info, warn, error; default: info)
- `--response-timeout`: Response timeout in seconds (default: 30)
- `--tls-cert`: TLS certificate file path (default: certs/server.crt)
- `--tls-key`: TLS key file path (default: certs/server.key)
- `--max-response-size`: Maximum response size in bytes (default: 5242880)
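A typical invocation might look like this (paths and addresses are illustrative):
```bash
./dist/gemserve \
  --listen 0.0.0.0:1965 \
  --root-path ./srv \
  --dir-indexing \
  --log-level debug \
  --tls-cert certs/server.crt \
  --tls-key certs/server.key
```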
Testing Strategy
- **server/server_test.go**: Path security and file serving tests
- **gemini/url_test.go**: URL parsing and normalization tests
- Focus on security edge cases (Unicode, traversal attempts, malformed URLs)
- Use parallel test execution for performance
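The tests follow roughly this shape (a generic sketch against the standard library's `filepath.IsLocal`, not a copy of the repository's own tests):
```go
package server

import (
	"path/filepath"
	"testing"
)

func TestPathIsLocal(t *testing.T) {
	t.Parallel()
	tests := []struct {
		name string
		path string
		want bool
	}{
		{"plain file", "docs/index.gmi", true},
		{"parent traversal", "../etc/passwd", false},
		{"hidden traversal", "a/../../secret", false},
		{"absolute path", "/etc/passwd", false},
	}
	for _, tc := range tests {
		tc := tc
		t.Run(tc.name, func(t *testing.T) {
			t.Parallel()
			if got := filepath.IsLocal(tc.path); got != tc.want {
				t.Errorf("IsLocal(%q) = %v, want %v", tc.path, got, tc.want)
			}
		})
	}
}
```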
Security Considerations
- All connections require TLS certificates (stored in certs/)
- Path traversal protection is critical - test thoroughly when modifying file serving logic
- Request size limited to 1KB per Gemini specification
- Input validation on all URLs and paths

1
CLAUDE.md Symbolic link

@@ -0,0 +1 @@
./AGENTS.md

Dockerfile

@@ -1,15 +1,16 @@
FROM golang:1.23-bookworm
FROM debian:12-slim
RUN apt-get update && apt-get upgrade -y
RUN useradd -u 1000 -m user
COPY ./gemserve /app/gemserve
COPY ./dist/gemserve /app/gemserve
WORKDIR /app
RUN chmod +x /app/gemserve && \
chown -R user:user /app
chown -R root:root /app && \
chmod -R 755 /app
USER user
CMD ["/app/gemserve","0.0.0.0:1965"]
CMD ["/app/gemserve","--listen","0.0.0.0:1965","--root-path","/srv"]

Makefile

@@ -1,17 +1,16 @@
SHELL := /bin/env oksh
SHELL := /bin/sh
export PATH := $(PATH)
all: fmt lintfix tidy test clean build
clean:
rm -f ./gemserve
rm -rf ./dist
debug:
@echo "PATH: $(PATH)"
@echo "GOPATH: $(shell go env GOPATH)"
@which go
@which gofumpt
@which gci
@which golangci-lint
# Test
@@ -24,7 +23,6 @@ tidy:
# Format code
fmt:
gofumpt -l -w .
gci write .
# Run linter
lint: fmt
@@ -34,10 +32,13 @@ lint: fmt
lintfix: fmt
golangci-lint run --fix
build:
go build -o ./gemserve ./main.go
build: clean
mkdir -p ./dist
go build -mod=vendor -o ./dist/gemserve ./cmd/gemserve/gemserve.go
go build -mod=vendor -o ./dist/gemget ./cmd/gemget/gemget.go
go build -mod=vendor -o ./dist/gembench ./cmd/gembench/gembench.go
build-docker:
build-docker: build
docker build -t gemserve .
show-updates:

README.md

@@ -18,17 +18,8 @@ make #run tests and build
Run:
```shell
LOG_LEVEL=info \
PANIC_ON_UNEXPECTED_ERROR=true \
RESPONSE_TIMEOUT=10 \ #seconds
ROOT_PATH=./srv \
DIR_INDEXING_ENABLED=false \
./gemserve 0.0.0.0:1965
./dist/gemserve
```
You'll need TLS keys; you can use `certs/generate.sh`
for quick generation.
## TODO
- [ ] Make TLS keys path configurable via env var
- [ ] Fix slowloris (proper response timeouts)

182
cmd/gembench/gembench.go Normal file

@@ -0,0 +1,182 @@
package main
// Benchmarks a Gemini server.
import (
"context"
"crypto/tls"
"flag"
"fmt"
"io"
"log/slog"
"net/url"
"os"
"os/signal"
"strings"
"sync"
"syscall"
"time"
"gemserve/lib/logging"
)
func main() {
// Parse command-line flags
insecure := flag.Bool("insecure", false, "Skip TLS certificate verification")
totalConnections := flag.Int("total-connections", 250, "Total connections to make")
parallelism := flag.Int("parallelism", 10, "How many connections to run in parallel")
flag.Parse()
// Get the URL from arguments
args := flag.Args()
if len(args) != 1 {
fmt.Fprintf(os.Stderr, "Usage: gemget [--insecure] <gemini-url>\n")
os.Exit(1)
}
logging.SetupLogging(slog.LevelInfo)
logger := logging.Logger
ctx := logging.WithLogger(context.Background(), logger)
geminiURL := args[0]
host := validateUrl(geminiURL)
if host == "" {
logger.Error("Invalid URL.")
os.Exit(1)
}
start := time.Now()
err := benchmark(ctx, geminiURL, host, *insecure, *totalConnections, *parallelism)
if err != nil {
logger.Error(err.Error())
os.Exit(1)
}
end := time.Now()
tookMs := end.Sub(start).Milliseconds()
logger.Info("End.", "ms", tookMs)
}
var wg sync.WaitGroup
type ctxKey int
const ctxKeyJobIndex ctxKey = 1
func benchmark(ctx context.Context, u string, h string, insecure bool, totalConnections int, parallelism int) error {
logger := logging.FromContext(ctx)
signals := make(chan os.Signal, 1)
signal.Notify(signals, syscall.SIGINT, syscall.SIGTERM)
// Root context, used to cancel
// connections and graceful shutdown.
ctx, cancel := context.WithCancel(ctx)
defer cancel()
// Semaphore to limit concurrency.
// Goroutines put value to channel (acquire slot)
// and consume value from channel (release slot).
semaphore := make(chan struct{}, parallelism)
loop:
for i := 0; i < totalConnections; i++ {
select {
case <-signals:
logger.Warn("Received SIGINT or SIGTERM signal, shutting down gracefully")
cancel()
break loop
case semaphore <- struct{}{}: // Acquire slot
wg.Add(1)
go func(jobIndex int) {
defer func() {
<-semaphore // Release slot
wg.Done()
}()
ctxWithValue := context.WithValue(ctx, ctxKeyJobIndex, jobIndex)
ctxWithTimeout, cancel := context.WithTimeout(ctxWithValue, 60*time.Second)
defer cancel()
err := connect(ctxWithTimeout, u, h, insecure)
if err != nil {
logger.Warn(fmt.Sprintf("%d error: %v", jobIndex, err))
}
}(i)
}
}
wg.Wait()
return nil
}
func validateUrl(u string) string {
// Parse the URL
parsedURL, err := url.Parse(u)
if err != nil {
fmt.Fprintf(os.Stderr, "Error parsing URL: %v\n", err)
os.Exit(1)
}
// Ensure it's a gemini URL
if parsedURL.Scheme != "gemini" {
fmt.Fprintf(os.Stderr, "Error: URL must use gemini:// scheme\n")
os.Exit(1)
}
// Get host and port
host := parsedURL.Host
if !strings.Contains(host, ":") {
host = host + ":1965" // Default Gemini port
}
return host
}
func connect(ctx context.Context, url string, host string, insecure bool) error {
logger := logging.FromContext(ctx)
tlsConfig := &tls.Config{
InsecureSkipVerify: insecure,
MinVersion: tls.VersionTLS12,
}
// Context checkpoint
if ctx.Err() != nil {
return nil
}
// Connect to the server
conn, err := tls.Dial("tcp", host, tlsConfig)
if err != nil {
return err
}
// Set connection deadline based on context
if deadline, ok := ctx.Deadline(); ok {
_ = conn.SetDeadline(deadline)
}
defer func() {
_ = conn.Close()
}()
// Context checkpoint
if ctx.Err() != nil {
return nil
}
// Send the request (URL + CRLF)
request := url + "\r\n"
_, err = conn.Write([]byte(request))
if err != nil {
return err
}
// Context checkpoint
if ctx.Err() != nil {
return nil
}
// Read and dump response
_, err = io.Copy(io.Discard, conn)
if err != nil {
return err
}
jobIndex := ctx.Value(ctxKeyJobIndex)
logger.Debug(fmt.Sprintf("%d done", jobIndex))
return nil
}

88
cmd/gemget/gemget.go Normal file

@@ -0,0 +1,88 @@
package main
// Simply does Gemini requests and prints output.
import (
"crypto/tls"
"flag"
"fmt"
"io"
"net/url"
"os"
"strings"
)
func main() {
// Parse command-line flags
insecure := flag.Bool("insecure", false, "Skip TLS certificate verification")
flag.Parse()
// Get the URL from arguments
args := flag.Args()
if len(args) != 1 {
fmt.Fprintf(os.Stderr, "Usage: gemget [--insecure] <gemini-url>\n")
os.Exit(1)
}
geminiURL := args[0]
host := validateUrl(geminiURL)
connect(geminiURL, host, *insecure)
}
func validateUrl(u string) string {
// Parse the URL
parsedURL, err := url.Parse(u)
if err != nil {
fmt.Fprintf(os.Stderr, "Error parsing URL: %v\n", err)
os.Exit(1)
}
// Ensure it's a gemini URL
if parsedURL.Scheme != "gemini" {
fmt.Fprintf(os.Stderr, "Error: URL must use gemini:// scheme\n")
os.Exit(1)
}
// Get host and port
host := parsedURL.Host
if !strings.Contains(host, ":") {
host = host + ":1965" // Default Gemini port
}
return host
}
func connect(url string, host string, insecure bool) {
// Configure TLS
tlsConfig := &tls.Config{
InsecureSkipVerify: insecure,
MinVersion: tls.VersionTLS12,
}
// Connect to the server
conn, err := tls.Dial("tcp", host, tlsConfig)
if err != nil {
fmt.Fprintf(os.Stderr, "Error connecting to server: %v\n", err)
os.Exit(1)
}
defer func() {
_ = conn.Close()
}()
// Send the request (URL + CRLF)
request := url + "\r\n"
_, err = conn.Write([]byte(request))
if err != nil {
fmt.Fprintf(os.Stderr, "Error sending request: %v\n", err)
os.Exit(1)
}
// Read and print the response to stdout
_, err = io.Copy(os.Stdout, conn)
if err != nil {
fmt.Fprintf(os.Stderr, "Error reading response: %v\n", err)
os.Exit(1)
}
}

176
cmd/gemserve/gemserve.go Normal file

@@ -0,0 +1,176 @@
package main
// A Gemini server.
import (
"context"
"crypto/tls"
"fmt"
"net"
"net/http"
_ "net/http/pprof"
"os"
"os/signal"
"sync"
"syscall"
"time"
"gemserve/lib/apperrors"
"gemserve/lib/logging"
"gemserve/config"
"gemserve/server"
"git.antanst.com/antanst/uid"
)
func main() {
config.CONFIG = *config.GetConfig()
logging.SetupLogging(config.CONFIG.LogLevel)
logger := logging.Logger
ctx := logging.WithLogger(context.Background(), logger)
err := runApp(ctx)
if err != nil {
logger.Error("Fatal Error", "err", err)
panic(fmt.Sprintf("Fatal Error: %v", err))
}
os.Exit(0)
}
func runApp(ctx context.Context) error {
logger := logging.FromContext(ctx)
logger.Info("Starting up. Press Ctrl+C to exit")
listenAddr := config.CONFIG.ListenAddr
// Start pprof HTTP server if enabled
if config.CONFIG.PprofAddr != "" {
go func() {
logger.Info("Starting pprof HTTP server", "address", config.CONFIG.PprofAddr)
if err := http.ListenAndServe(config.CONFIG.PprofAddr, nil); err != nil {
panic(fmt.Sprintf("pprof HTTP server failed: %v", err))
}
}()
}
signals := make(chan os.Signal, 1)
signal.Notify(signals, syscall.SIGINT, syscall.SIGTERM)
// Only this file should send to this channel.
// All external functions should return errors.
fatalErrors := make(chan error)
// Root context, used to cancel
// connections and graceful shutdown.
serverCtx, cancel := context.WithCancel(ctx)
defer cancel()
// WaitGroup to track active connections
// in order to be able to wait until
// they are properly dropped
var wg sync.WaitGroup
// Spawn server on the background.
// Returned errors are considered fatal.
go func() {
err := startServer(serverCtx, listenAddr, &wg, fatalErrors)
if err != nil {
fatalErrors <- apperrors.NewFatalError(fmt.Errorf("server startup failed: %w", err))
}
}()
for {
select {
case <-signals:
logger.Warn("Received SIGINT or SIGTERM signal, shutting down gracefully")
cancel()
wg.Wait()
return nil
case fatalError := <-fatalErrors:
cancel()
wg.Wait()
return fatalError
}
}
}
func startServer(ctx context.Context, listenAddr string, wg *sync.WaitGroup, fatalErrors chan<- error) (err error) {
logger := logging.FromContext(ctx)
cert, err := tls.LoadX509KeyPair(config.CONFIG.TLSCert, config.CONFIG.TLSKey)
if err != nil {
return apperrors.NewFatalError(fmt.Errorf("failed to load TLS certificate/key: %w", err))
}
logger.Debug("Using TLS cert", "path", config.CONFIG.TLSCert)
logger.Debug("Using TLS key", "path", config.CONFIG.TLSKey)
tlsConfig := &tls.Config{
Certificates: []tls.Certificate{cert},
MinVersion: tls.VersionTLS12,
}
listener, err := tls.Listen("tcp", listenAddr, tlsConfig)
if err != nil {
return apperrors.NewFatalError(err)
}
defer func(listener net.Listener) {
_ = listener.Close()
}(listener)
// If context is cancelled, close listener
// to unblock Accept() inside main loop.
go func() {
<-ctx.Done()
_ = listener.Close()
}()
logger.Info("Server listening", "address", listenAddr)
for {
conn, err := listener.Accept()
if err != nil {
if ctx.Err() != nil {
return nil
} // ctx cancellation
logger.Info("Failed to accept connection: %v", "error", err)
continue
}
// At this point we have a new connection.
wg.Add(1)
go func() {
defer wg.Done()
// Type assert the connection to TLS connection
tlsConn, ok := conn.(*tls.Conn)
if !ok {
logger.Error("Connection is not a TLS connection")
_ = conn.Close()
return
}
remoteAddr := conn.RemoteAddr().String()
connId := uid.UID()
// Create cancellable connection context
// with connection ID.
connLogger := logging.WithAttr(logger, "id", connId)
connLogger = logging.WithAttr(connLogger, "remoteAddr", remoteAddr)
connCtx := context.WithValue(ctx, server.CtxConnIdKey, connId)
connCtx = context.WithValue(connCtx, logging.CtxLoggerKey, connLogger)
connCtx, cancel := context.WithTimeout(connCtx, time.Duration(config.CONFIG.ResponseTimeout)*time.Second)
defer cancel()
err := server.HandleConnection(connCtx, tlsConn)
if err != nil {
if apperrors.IsFatal(err) {
fatalErrors <- err
return
}
connLogger.Info("Connection failed", "error", err)
}
}()
}
}


@@ -1,126 +1,81 @@
package config
import (
"flag"
"fmt"
"log/slog"
"os"
"strconv"
"github.com/rs/zerolog"
"strings"
)
// Environment variable names.
const (
EnvLogLevel = "LOG_LEVEL"
EnvResponseTimeout = "RESPONSE_TIMEOUT"
EnvPanicOnUnexpectedError = "PANIC_ON_UNEXPECTED_ERROR"
EnvRootPath = "ROOT_PATH"
EnvDirIndexingEnabled = "DIR_INDEXING_ENABLED"
)
// Config holds the application configuration loaded from environment variables.
// Config holds the application configuration loaded from CLI flags.
type Config struct {
LogLevel zerolog.Level // Logging level (debug, info, warn, error)
LogLevel slog.Level // Logging level (debug, info, warn, error)
ResponseTimeout int // Timeout for responses in seconds
PanicOnUnexpectedError bool // Panic on unexpected errors when visiting a URL
RootPath string // Path to serve files from
DirIndexingEnabled bool // Allow client to browse directories or not
ListenAddr string // Address to listen on
TLSCert string // TLS certificate file
TLSKey string // TLS key file
MaxResponseSize int // Max response size in bytes
PprofAddr string // Address for pprof HTTP endpoint (empty = disabled)
}
var CONFIG Config //nolint:gochecknoglobals
// parsePositiveInt parses and validates positive integer values.
func parsePositiveInt(param, value string) (int, error) {
val, err := strconv.Atoi(value)
if err != nil {
return 0, ValidationError{
Param: param,
Value: value,
Reason: "must be a valid integer",
// parseLogLevel parses a log level string into slog.Level
func parseLogLevel(level string) (slog.Level, error) {
switch strings.ToLower(level) {
case "debug":
return slog.LevelDebug, nil
case "info":
return slog.LevelInfo, nil
case "warn", "warning":
return slog.LevelWarn, nil
case "error":
return slog.LevelError, nil
default:
return slog.LevelInfo, fmt.Errorf("invalid log level: %s", level)
}
}
if val <= 0 {
return 0, ValidationError{
Param: param,
Value: value,
Reason: "must be positive",
}
}
return val, nil
}
func parseBool(param, value string) (bool, error) {
val, err := strconv.ParseBool(value)
if err != nil {
return false, ValidationError{
Param: param,
Value: value,
Reason: "cannot be converted to boolean",
}
}
return val, nil
}
// GetConfig loads and validates configuration from environment variables
// GetConfig loads and validates configuration from CLI flags
func GetConfig() *Config {
config := &Config{}
// Define CLI flags with defaults
logLevel := flag.String("log-level", "info", "Logging level (debug, info, warn, error)")
responseTimeout := flag.Int("response-timeout", 30, "Timeout for responses in seconds")
rootPath := flag.String("root-path", "", "Path to serve files from")
dirIndexing := flag.Bool("dir-indexing", false, "Allow client to browse directories")
listen := flag.String("listen", "localhost:1965", "Address to listen on")
tlsCert := flag.String("tls-cert", "certs/server.crt", "TLS certificate file")
tlsKey := flag.String("tls-key", "certs/server.key", "TLS key file")
maxResponseSize := flag.Int("max-response-size", 5_242_880, "Max response size in bytes")
pprofAddr := flag.String("pprof-addr", "", "Address for pprof HTTP endpoint (empty = disabled)")
// Map of environment variables to their parsing functions
parsers := map[string]func(string) error{
EnvLogLevel: func(v string) error {
level, err := zerolog.ParseLevel(v)
if err != nil {
return ValidationError{
Param: EnvLogLevel,
Value: v,
Reason: "must be one of: debug, info, warn, error",
}
}
config.LogLevel = level
return nil
},
EnvResponseTimeout: func(v string) error {
val, err := parsePositiveInt(EnvResponseTimeout, v)
if err != nil {
return err
}
config.ResponseTimeout = val
return nil
},
EnvPanicOnUnexpectedError: func(v string) error {
val, err := parseBool(EnvPanicOnUnexpectedError, v)
if err != nil {
return err
}
config.PanicOnUnexpectedError = val
return nil
},
EnvRootPath: func(v string) error {
config.RootPath = v
return nil
},
EnvDirIndexingEnabled: func(v string) error {
val, err := parseBool(EnvDirIndexingEnabled, v)
if err != nil {
return err
}
config.DirIndexingEnabled = val
return nil
},
}
flag.Parse()
// Process each environment variable
for envVar, parser := range parsers {
value, ok := os.LookupEnv(envVar)
if !ok {
_, _ = fmt.Fprintf(os.Stderr, "Missing required environment variable: %s\n", envVar)
// Parse and validate log level
level, err := parseLogLevel(*logLevel)
if err != nil {
_, _ = fmt.Fprintf(os.Stderr, "Invalid log level '%s': must be one of: debug, info, warn, error\n", *logLevel)
os.Exit(1)
}
if err := parser(value); err != nil {
_, _ = fmt.Fprintf(os.Stderr, "Configuration error: %v\n", err)
// Validate response timeout
if *responseTimeout <= 0 {
_, _ = fmt.Fprintf(os.Stderr, "Invalid response timeout '%d': must be positive\n", *responseTimeout)
os.Exit(1)
}
}
return config
return &Config{
LogLevel: level,
ResponseTimeout: *responseTimeout,
RootPath: *rootPath,
DirIndexingEnabled: *dirIndexing,
ListenAddr: *listen,
TLSCert: *tlsCert,
TLSKey: *tlsKey,
MaxResponseSize: *maxResponseSize,
PprofAddr: *pprofAddr,
}
}


@@ -1,14 +0,0 @@
package config
import "fmt"
// ValidationError represents a config validation error
type ValidationError struct {
Param string
Value string
Reason string
}
func (e ValidationError) Error() string {
return fmt.Sprintf("invalid value '%s' for %s: %s", e.Value, e.Param, e.Reason)
}


@@ -1,114 +0,0 @@
package errors
import (
"errors"
"fmt"
"runtime"
"strings"
)
type fatal interface {
Fatal() bool
}
func IsFatal(err error) bool {
te, ok := errors.Unwrap(err).(fatal)
return ok && te.Fatal()
}
func As(err error, target any) bool {
return errors.As(err, target)
}
func Is(err, target error) bool {
return errors.Is(err, target)
}
func Unwrap(err error) error {
return errors.Unwrap(err)
}
type Error struct {
Err error
Stack string
fatal bool
}
func (e *Error) Error() string {
var sb strings.Builder
sb.WriteString(fmt.Sprintf("%v\n", e.Err))
return sb.String()
}
func (e *Error) ErrorWithStack() string {
var sb strings.Builder
sb.WriteString(fmt.Sprintf("%v\n", e.Err))
sb.WriteString(fmt.Sprintf("Stack Trace:\n%s", e.Stack))
return sb.String()
}
func (e *Error) Fatal() bool {
return e.fatal
}
func (e *Error) Unwrap() error {
return e.Err
}
func NewError(err error) error {
if err == nil {
return nil
}
// Check if it's already of our own
// Error type, so we don't add stack twice.
var asError *Error
if errors.As(err, &asError) {
return err
}
// Get the stack trace
var stack strings.Builder
buf := make([]uintptr, 50)
n := runtime.Callers(2, buf)
frames := runtime.CallersFrames(buf[:n])
// Format the stack trace
for {
frame, more := frames.Next()
// Skip runtime and standard library frames
if !strings.Contains(frame.File, "runtime/") {
stack.WriteString(fmt.Sprintf("\t%s:%d - %s\n", frame.File, frame.Line, frame.Function))
}
if !more {
break
}
}
return &Error{
Err: err,
Stack: stack.String(),
}
}
func NewFatalError(err error) error {
if err == nil {
return nil
}
// Check if it's already of our own
// Error type.
var asError *Error
if errors.As(err, &asError) {
return err
}
err2 := NewError(err)
err2.(*Error).fatal = true
return err2
}
var ConnectionError error = fmt.Errorf("connection error")
func NewConnectionError(err error) error {
return fmt.Errorf("%w: %w", ConnectionError, err)
}


@@ -1,71 +0,0 @@
package errors
import (
"errors"
"fmt"
"testing"
)
type CustomError struct {
Err error
}
func (e *CustomError) Error() string { return e.Err.Error() }
func IsCustomError(err error) bool {
var asError *CustomError
return errors.As(err, &asError)
}
func TestWrapping(t *testing.T) {
t.Parallel()
originalErr := errors.New("original error")
err1 := NewError(originalErr)
if !errors.Is(err1, originalErr) {
t.Errorf("original error is not wrapped")
}
if !Is(err1, originalErr) {
t.Errorf("original error is not wrapped")
}
unwrappedErr := errors.Unwrap(err1)
if !errors.Is(unwrappedErr, originalErr) {
t.Errorf("original error is not wrapped")
}
if !Is(unwrappedErr, originalErr) {
t.Errorf("original error is not wrapped")
}
unwrappedErr = Unwrap(err1)
if !errors.Is(unwrappedErr, originalErr) {
t.Errorf("original error is not wrapped")
}
if !Is(unwrappedErr, originalErr) {
t.Errorf("original error is not wrapped")
}
wrappedErr := fmt.Errorf("wrapped: %w", originalErr)
if !errors.Is(wrappedErr, originalErr) {
t.Errorf("original error is not wrapped")
}
if !Is(wrappedErr, originalErr) {
t.Errorf("original error is not wrapped")
}
}
func TestNewError(t *testing.T) {
t.Parallel()
originalErr := &CustomError{errors.New("err1")}
if !IsCustomError(originalErr) {
t.Errorf("TestNewError fail #1")
}
err1 := NewError(originalErr)
if !IsCustomError(err1) {
t.Errorf("TestNewError fail #2")
}
wrappedErr1 := fmt.Errorf("wrapped %w", err1)
if !IsCustomError(wrappedErr1) {
t.Errorf("TestNewError fail #3")
}
unwrappedErr1 := Unwrap(wrappedErr1)
if !IsCustomError(unwrappedErr1) {
t.Errorf("TestNewError fail #4")
}
}

33
gemini/statusCodes.go Normal file

@@ -0,0 +1,33 @@
package gemini
// Gemini status codes as defined in the Gemini spec
// gemini://geminiprotocol.net/docs/protocol-specification.gmi
const (
// Input group
StatusInputExpected = 10
StatusInputExpectedSensitive = 11
StatusSuccess = 20
// Redirect group
StatusRedirectTemporary = 30
StatusRedirectPermanent = 31
// Temporary failure group
StatusTemporaryFailure = 40
StatusServerUnavailable = 41
StatusCGIError = 42
StatusProxyError = 43
StatusSlowDown = 44
// Permanent failure group
StatusPermanentFailure = 50
StatusNotFound = 51
StatusGone = 52
StatusProxyRequestRefused = 53
StatusBadRequest = 59
// TLS certificate group
StatusCertificateRequired = 60
StatusCertificateNotAuthorized = 61
StatusCertificateNotValid = 62
)


@@ -1,4 +1,4 @@
package common
package gemini
import (
"database/sql/driver"
@@ -8,7 +8,7 @@ import (
"strconv"
"strings"
"gemserve/errors"
"gemserve/lib/apperrors"
)
type URL struct {
@@ -28,7 +28,7 @@ func (u *URL) Scan(value interface{}) error {
}
b, ok := value.(string)
if !ok {
return errors.NewFatalError(fmt.Errorf("database scan error: expected string, got %T", value))
return apperrors.NewFatalError(fmt.Errorf("database scan error: expected string, got %T", value))
}
parsedURL, err := ParseURL(b, "", false)
if err != nil {
@@ -67,12 +67,10 @@ func ParseURL(input string, descr string, normalize bool) (*URL, error) {
} else {
u, err = url.Parse(input)
if err != nil {
return nil, errors.NewError(fmt.Errorf("error parsing URL: %w: %s", err, input))
return nil, fmt.Errorf("error parsing URL: %w: %s", err, input)
}
}
if u.Scheme != "gemini" {
return nil, errors.NewError(fmt.Errorf("error parsing URL: not a gemini URL: %s", input))
}
protocol := u.Scheme
hostname := u.Hostname()
strPort := u.Port()
@@ -82,7 +80,7 @@ func ParseURL(input string, descr string, normalize bool) (*URL, error) {
}
port, err := strconv.Atoi(strPort)
if err != nil {
return nil, errors.NewError(fmt.Errorf("error parsing URL: %w: %s", err, input))
return nil, fmt.Errorf("error parsing URL: %w: %s", err, input)
}
full := fmt.Sprintf("%s://%s:%d%s", protocol, hostname, port, urlPath)
// full field should also contain query params and url fragments
@@ -128,13 +126,13 @@ func NormalizeURL(rawURL string) (*url.URL, error) {
// Parse the URL
u, err := url.Parse(rawURL)
if err != nil {
return nil, errors.NewError(fmt.Errorf("error normalizing URL: %w: %s", err, rawURL))
return nil, fmt.Errorf("error normalizing URL: %w: %s", err, rawURL)
}
if u.Scheme == "" {
return nil, errors.NewError(fmt.Errorf("error normalizing URL: No scheme: %s", rawURL))
return nil, fmt.Errorf("error normalizing URL: No scheme: %s", rawURL)
}
if u.Host == "" {
return nil, errors.NewError(fmt.Errorf("error normalizing URL: No host: %s", rawURL))
return nil, fmt.Errorf("error normalizing URL: No host: %s", rawURL)
}
// Convert scheme to lowercase


@@ -1,4 +1,4 @@
package common
package gemini
import (
"net/url"
@@ -11,7 +11,7 @@ func TestParseURL(t *testing.T) {
input := "gemini://caolan.uk/cgi-bin/weather.py/wxfcs/3162"
parsed, err := ParseURL(input, "", true)
value, _ := parsed.Value()
if err != nil || !(value == "gemini://caolan.uk:1965/cgi-bin/weather.py/wxfcs/3162") {
if err != nil || (value != "gemini://caolan.uk:1965/cgi-bin/weather.py/wxfcs/3162") {
t.Errorf("fail: %s", parsed)
}
}

14
go.mod

@@ -1,16 +1,12 @@
module gemserve
go 1.23.4
go 1.24.3
require (
github.com/gabriel-vasile/mimetype v1.4.8
git.antanst.com/antanst/uid v0.0.1
github.com/gabriel-vasile/mimetype v1.4.10
github.com/lmittmann/tint v1.1.2
github.com/matoous/go-nanoid/v2 v2.1.0
github.com/rs/zerolog v1.33.0
)
require (
github.com/mattn/go-colorable v0.1.13 // indirect
github.com/mattn/go-isatty v0.0.19 // indirect
golang.org/x/net v0.33.0 // indirect
golang.org/x/sys v0.29.0 // indirect
)
replace git.antanst.com/antanst/uid => ../uid

24
go.sum

@@ -1,30 +1,14 @@
github.com/coreos/go-systemd/v22 v22.5.0/go.mod h1:Y58oyj3AT4RCenI/lSvhwexgC+NSVTIJ3seZv2GcEnc=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/gabriel-vasile/mimetype v1.4.8 h1:FfZ3gj38NjllZIeJAmMhr+qKL8Wu+nOoI3GqacKw1NM=
github.com/gabriel-vasile/mimetype v1.4.8/go.mod h1:ByKUIKGjh1ODkGM1asKUbQZOLGrPjydw3hYPU2YU9t8=
github.com/godbus/dbus/v5 v5.0.4/go.mod h1:xhWf0FNVPg57R7Z0UbKHbJfkEywrmjJnf7w5xrFpKfA=
github.com/gabriel-vasile/mimetype v1.4.10 h1:zyueNbySn/z8mJZHLt6IPw0KoZsiQNszIpU+bX4+ZK0=
github.com/gabriel-vasile/mimetype v1.4.10/go.mod h1:d+9Oxyo1wTzWdyVUPMmXFvp4F9tea18J8ufA774AB3s=
github.com/lmittmann/tint v1.1.2 h1:2CQzrL6rslrsyjqLDwD11bZ5OpLBPU+g3G/r5LSfS8w=
github.com/lmittmann/tint v1.1.2/go.mod h1:HIS3gSy7qNwGCj+5oRjAutErFBl4BzdQP6cJZ0NfMwE=
github.com/matoous/go-nanoid/v2 v2.1.0 h1:P64+dmq21hhWdtvZfEAofnvJULaRR1Yib0+PnU669bE=
github.com/matoous/go-nanoid/v2 v2.1.0/go.mod h1:KlbGNQ+FhrUNIHUxZdL63t7tl4LaPkZNpUULS8H4uVM=
github.com/mattn/go-colorable v0.1.13 h1:fFA4WZxdEF4tXPZVKMLwD8oUnCTTo08duU7wxecdEvA=
github.com/mattn/go-colorable v0.1.13/go.mod h1:7S9/ev0klgBDR4GtXTXX8a3vIGJpMovkB8vQcUbaXHg=
github.com/mattn/go-isatty v0.0.16/go.mod h1:kYGgaQfpe5nmfYZH+SKPsOc2e4SrIfOl2e/yFXSvRLM=
github.com/mattn/go-isatty v0.0.19 h1:JITubQf0MOLdlGRuRq+jtsDlekdYPia9ZFsB8h/APPA=
github.com/mattn/go-isatty v0.0.19/go.mod h1:W+V8PltTTMOvKvAeJH7IuucS94S2C6jfK/D7dTCTo3Y=
github.com/pkg/errors v0.9.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/rs/xid v1.5.0/go.mod h1:trrq9SKmegXys3aeAKXMUTdJsYXVwGY3RLcfgqegfbg=
github.com/rs/zerolog v1.33.0 h1:1cU2KZkvPxNyfgEmhHAz/1A9Bz+llsdYzklWFzgp0r8=
github.com/rs/zerolog v1.33.0/go.mod h1:/7mN4D5sKwJLZQ2b/znpjC3/GQWY/xaDXUM0kKWRHss=
github.com/stretchr/testify v1.9.0 h1:HtqpIVDClZ4nwg75+f6Lvsy/wHu+3BoSGCbBAcpTsTg=
github.com/stretchr/testify v1.9.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY=
golang.org/x/net v0.33.0 h1:74SYHlV8BIgHIFC/LrYkOGIwL19eTYXQ5wc6TBuO36I=
golang.org/x/net v0.33.0/go.mod h1:HXLR5J+9DxmrqMwG9qjGCxZ+zKXxBru04zlTvWlWuN4=
golang.org/x/sys v0.0.0-20220811171246-fbc7d0a398ab/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.12.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.29.0 h1:TPYlXGxvx1MGTn2GiZDhnjPA9wZzZeGKHHmKhHYvgaU=
golang.org/x/sys v0.29.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=

94
lib/apperrors/errors.go Normal file

@@ -0,0 +1,94 @@
package apperrors
import "errors"
type ErrorType int
const (
ErrorOther ErrorType = iota
ErrorNetwork
ErrorGemini
)
type AppError struct {
Type ErrorType
StatusCode int
Fatal bool
Err error
}
func (e *AppError) Error() string {
if e == nil || e.Err == nil {
return ""
}
return e.Err.Error()
}
func (e *AppError) Unwrap() error {
if e == nil {
return nil
}
return e.Err
}
func NewError(err error) error {
return &AppError{
Type: ErrorOther,
StatusCode: 0,
Fatal: false,
Err: err,
}
}
func NewFatalError(err error) error {
return &AppError{
Type: ErrorOther,
StatusCode: 0,
Fatal: true,
Err: err,
}
}
func NewNetworkError(err error) error {
return &AppError{
Type: ErrorNetwork,
StatusCode: 0,
Fatal: false,
Err: err,
}
}
func NewGeminiError(err error, statusCode int) error {
return &AppError{
Type: ErrorGemini,
StatusCode: statusCode,
Fatal: false,
Err: err,
}
}
func GetStatusCode(err error) int {
var appError *AppError
if errors.As(err, &appError) && appError != nil {
return appError.StatusCode
}
return 0
}
func IsGeminiError(err error) bool {
var appError *AppError
if errors.As(err, &appError) && appError != nil {
if appError.Type == ErrorGemini {
return true
}
}
return false
}
func IsFatal(err error) bool {
var appError *AppError
if errors.As(err, &appError) && appError != nil {
return appError.Fatal
}
return false
}

76
lib/logging/logging.go Normal file

@@ -0,0 +1,76 @@
package logging
import (
"context"
"log/slog"
"os"
"path/filepath"
"github.com/lmittmann/tint"
)
type contextKey string
const CtxLoggerKey contextKey = "logger"
var (
programLevel *slog.LevelVar = new(slog.LevelVar) // Info by default
Logger *slog.Logger
)
func WithLogger(ctx context.Context, logger *slog.Logger) context.Context {
return context.WithValue(ctx, CtxLoggerKey, logger)
}
func WithAttr(logger *slog.Logger, attrName string, attrValue interface{}) *slog.Logger {
return logger.With(attrName, attrValue)
}
func FromContext(ctx context.Context) *slog.Logger {
if logger, ok := ctx.Value(CtxLoggerKey).(*slog.Logger); ok {
return logger
}
// Return default logger instead of panicking
return slog.Default()
}
func SetupLogging(logLevel slog.Level) {
programLevel.Set(logLevel)
// With coloring (uses external package)
opts := &tint.Options{
AddSource: true,
Level: programLevel,
ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
if a.Key == slog.SourceKey {
if source, ok := a.Value.Any().(*slog.Source); ok {
source.File = filepath.Base(source.File)
}
}
// Customize level colors:
// gray for debug, blue for info,
// yellow for warnings, black on red bg for errors
if a.Key == slog.LevelKey && len(groups) == 0 {
level, ok := a.Value.Any().(slog.Level)
if ok {
switch level {
case slog.LevelDebug:
// Use grayscale color (232-255 range) for gray/faint debug messages
return tint.Attr(240, a)
case slog.LevelInfo:
// Use color code 12 (bright blue) for info
return tint.Attr(12, a)
case slog.LevelWarn:
// Use color code 11 (bright yellow) for warnings
return tint.Attr(11, a)
case slog.LevelError:
// For black on red background, we need to modify the string value directly
// since tint.Attr() only supports foreground colors
return slog.String(a.Key, "\033[30;41m"+a.Value.String()+"\033[0m")
}
}
}
return a
},
}
Logger = slog.New(tint.NewHandler(os.Stdout, opts))
}


@@ -1,23 +0,0 @@
package logging
import (
"fmt"
zlog "github.com/rs/zerolog/log"
)
func LogDebug(format string, args ...interface{}) {
zlog.Debug().Msg(fmt.Sprintf(format, args...))
}
func LogInfo(format string, args ...interface{}) {
zlog.Info().Msg(fmt.Sprintf(format, args...))
}
func LogWarn(format string, args ...interface{}) {
zlog.Warn().Msg(fmt.Sprintf(format, args...))
}
func LogError(format string, args ...interface{}) {
zlog.Error().Err(fmt.Errorf(format, args...)).Msg("")
}

189
main.go

@@ -1,189 +0,0 @@
package main
import (
"bytes"
"crypto/tls"
"fmt"
"io"
"net"
"os"
"os/signal"
"strings"
"syscall"
"time"
"gemserve/config"
"gemserve/errors"
"gemserve/logging"
"gemserve/server"
"gemserve/uid"
"github.com/rs/zerolog"
zlog "github.com/rs/zerolog/log"
)
func main() {
config.CONFIG = *config.GetConfig()
zerolog.TimeFieldFormat = zerolog.TimeFormatUnix
zerolog.SetGlobalLevel(config.CONFIG.LogLevel)
zlog.Logger = zlog.Output(zerolog.ConsoleWriter{Out: os.Stderr, TimeFormat: "[2006-01-02 15:04:05]"})
err := runApp()
if err != nil {
fmt.Printf("%v\n", err)
logging.LogError("%v", err)
os.Exit(1)
}
}
func runApp() error {
logging.LogInfo("Starting up. Press Ctrl+C to exit")
var listenHost string
if len(os.Args) != 2 {
listenHost = "0.0.0.0:1965"
} else {
listenHost = os.Args[1]
}
signals := make(chan os.Signal, 1)
signal.Notify(signals, syscall.SIGINT, syscall.SIGTERM)
serverErrors := make(chan error)
go func() {
err := startServer(listenHost)
if err != nil {
serverErrors <- errors.NewFatalError(err)
}
}()
for {
select {
case <-signals:
logging.LogWarn("Received SIGINT or SIGTERM signal, exiting")
return nil
case serverError := <-serverErrors:
return errors.NewFatalError(serverError)
}
}
}
func startServer(listenHost string) (err error) {
cert, err := tls.LoadX509KeyPair("/certs/cert", "/certs/key")
if err != nil {
return errors.NewFatalError(fmt.Errorf("failed to load certificate: %w", err))
}
tlsConfig := &tls.Config{
Certificates: []tls.Certificate{cert},
MinVersion: tls.VersionTLS12,
}
listener, err := tls.Listen("tcp", listenHost, tlsConfig)
if err != nil {
return errors.NewFatalError(fmt.Errorf("failed to create listener: %w", err))
}
defer func(listener net.Listener) {
// If we've got an error closing the
// listener, make sure we don't override
// the original error (if not nil)
errClose := listener.Close()
if errClose != nil && err == nil {
err = errors.NewFatalError(err)
}
}(listener)
logging.LogInfo("Server listening on %s", listenHost)
for {
conn, err := listener.Accept()
if err != nil {
logging.LogInfo("Failed to accept connection: %v", err)
continue
}
go func() {
err := handleConnection(conn.(*tls.Conn))
if err != nil {
var asErr *errors.Error
if errors.As(err, &asErr) {
logging.LogError("Unexpected error: %v", err.(*errors.Error).ErrorWithStack())
} else {
logging.LogError("Unexpected error: %v", err)
}
if config.CONFIG.PanicOnUnexpectedError {
panic("Encountered unexpected error")
}
}
}()
}
}
func closeConnection(conn *tls.Conn) error {
err := conn.CloseWrite()
if err != nil {
return errors.NewConnectionError(fmt.Errorf("failed to close TLS connection: %w", err))
}
err = conn.Close()
if err != nil {
return errors.NewConnectionError(fmt.Errorf("failed to close connection: %w", err))
}
return nil
}
func handleConnection(conn *tls.Conn) (err error) {
remoteAddr := conn.RemoteAddr().String()
connId := uid.UID()
start := time.Now()
var outputBytes []byte
defer func(conn *tls.Conn) {
// Three possible cases here:
// - We don't have an error
// - We have a ConnectionError, which we don't propagate up
// - We have an unexpected error.
end := time.Now()
tookMs := end.Sub(start).Milliseconds()
var responseHeader string
if err != nil {
_, _ = conn.Write([]byte("50 server error"))
responseHeader = "50 server error"
// We don't propagate connection errors up.
if errors.Is(err, errors.ConnectionError) {
logging.LogInfo("%s %s %v", connId, remoteAddr, err)
err = nil
}
} else {
if i := bytes.Index(outputBytes, []byte{'\r'}); i >= 0 {
responseHeader = string(outputBytes[:i])
}
}
logging.LogInfo("%s %s response %s (%dms)", connId, remoteAddr, responseHeader, tookMs)
_ = closeConnection(conn)
}(conn)
// Gemini is supposed to have a 1kb limit
// on input requests.
buffer := make([]byte, 1024)
n, err := conn.Read(buffer)
if err != nil && err != io.EOF {
return errors.NewConnectionError(fmt.Errorf("failed to read connection data: %w", err))
}
if n == 0 {
return errors.NewConnectionError(fmt.Errorf("client did not send data"))
}
dataBytes := buffer[:n]
dataString := string(dataBytes)
logging.LogInfo("%s %s request %s (%d bytes)", connId, remoteAddr, strings.TrimSpace(dataString), len(dataBytes))
outputBytes, err = server.GenerateResponse(conn, connId, dataString)
if err != nil {
return err
}
_, err = conn.Write(outputBytes)
if err != nil {
return err
}
return nil
}


@@ -1,61 +1,123 @@
package server
import (
"bytes"
"context"
"crypto/tls"
"errors"
"fmt"
"io"
"net"
"net/url"
"os"
"path"
"path/filepath"
"strconv"
"strings"
"time"
"unicode/utf8"
"gemserve/lib/apperrors"
"gemserve/lib/logging"
"gemserve/common"
"gemserve/config"
"gemserve/errors"
"gemserve/logging"
"gemserve/gemini"
"github.com/gabriel-vasile/mimetype"
)
type contextKey string
const CtxConnIdKey contextKey = "connId"
type ServerConfig interface {
DirIndexingEnabled() bool
RootPath() string
}
func GenerateResponse(conn *tls.Conn, connId string, input string) ([]byte, error) {
func CloseConnection(conn *tls.Conn) error {
err := conn.CloseWrite()
if err != nil {
return apperrors.NewNetworkError(fmt.Errorf("failed to close TLS connection: %w", err))
}
err = conn.Close()
if err != nil {
return apperrors.NewNetworkError(fmt.Errorf("failed to close connection: %w", err))
}
return nil
}
func checkRequestURL(url *gemini.URL) error {
if !utf8.ValidString(url.String()) {
return apperrors.NewGeminiError(fmt.Errorf("invalid URL"), gemini.StatusBadRequest)
}
if url.Protocol != "gemini" {
return apperrors.NewGeminiError(fmt.Errorf("invalid URL"), gemini.StatusProxyRequestRefused)
}
_, portStr, err := net.SplitHostPort(config.CONFIG.ListenAddr)
if err != nil {
return apperrors.NewGeminiError(fmt.Errorf("failed to parse server listen address: %w", err), gemini.StatusBadRequest)
}
listenPort, err := strconv.Atoi(portStr)
if err != nil {
return apperrors.NewGeminiError(fmt.Errorf("invalid server listen port: %w", err), gemini.StatusBadRequest)
}
if url.Port != listenPort {
return apperrors.NewGeminiError(fmt.Errorf("port mismatch"), gemini.StatusProxyRequestRefused)
}
return nil
}
func GenerateResponse(ctx context.Context, conn *tls.Conn, input string) ([]byte, error) {
logger := logging.FromContext(ctx)
connId := ctx.Value(CtxConnIdKey).(string)
if err := ctx.Err(); err != nil {
return nil, err
}
trimmedInput := strings.TrimSpace(input)
// url will have a cleaned and normalized path after this
url, err := common.ParseURL(trimmedInput, "", true)
url, err := gemini.ParseURL(trimmedInput, "", true)
if err != nil {
return nil, errors.NewConnectionError(fmt.Errorf("failed to parse URL: %w", err))
return nil, apperrors.NewGeminiError(fmt.Errorf("failed to parse URL: %w", err), gemini.StatusBadRequest)
}
logging.LogDebug("%s %s normalized URL path: %s", connId, conn.RemoteAddr(), url.Path)
logger.Debug("normalized URL path", "id", connId, "remoteAddr", conn.RemoteAddr(), "path", url.Path)
err = checkRequestURL(url)
if err != nil {
return nil, err
}
serverRootPath := config.CONFIG.RootPath
localPath, err := calculateLocalPath(url.Path, serverRootPath)
if err != nil {
return nil, errors.NewConnectionError(err)
return nil, apperrors.NewGeminiError(err, gemini.StatusBadRequest)
}
logging.LogDebug("%s %s request file path: %s", connId, conn.RemoteAddr(), localPath)
logger.Debug("request path", "id", connId, "remoteAddr", conn.RemoteAddr(), "local path", localPath)
// Get file/directory information
info, err := os.Stat(localPath)
if errors.Is(err, os.ErrNotExist) || errors.Is(err, os.ErrPermission) {
return []byte("51 not found\r\n"), nil
} else if err != nil {
return nil, errors.NewConnectionError(fmt.Errorf("%s %s failed to access path: %w", connId, conn.RemoteAddr(), err))
if err != nil {
return nil, apperrors.NewGeminiError(fmt.Errorf("failed to access path: %w", err), gemini.StatusNotFound)
}
// Handle directory.
if info.IsDir() {
return generateResponseDir(conn, connId, url, localPath)
return generateResponseDir(ctx, localPath)
}
return generateResponseFile(conn, connId, url, localPath)
return generateResponseFile(ctx, localPath)
}
func generateResponseFile(conn *tls.Conn, connId string, url *common.URL, localPath string) ([]byte, error) {
func generateResponseFile(ctx context.Context, localPath string) ([]byte, error) {
if err := ctx.Err(); err != nil {
return nil, err
}
data, err := os.ReadFile(localPath)
if errors.Is(err, os.ErrNotExist) || errors.Is(err, os.ErrPermission) {
return []byte("51 not found\r\n"), nil
} else if err != nil {
return nil, errors.NewConnectionError(fmt.Errorf("%s %s failed to read file: %w", connId, conn.RemoteAddr(), err))
if err != nil {
return nil, apperrors.NewGeminiError(fmt.Errorf("failed to access path: %w", err), gemini.StatusNotFound)
}
var mimeType string
@@ -64,15 +126,18 @@ func generateResponseFile(conn *tls.Conn, connId string, url *common.URL, localP
} else {
mimeType = mimetype.Detect(data).String()
}
headerBytes := []byte(fmt.Sprintf("20 %s\r\n", mimeType))
headerBytes := []byte(fmt.Sprintf("%d %s; lang=en\r\n", gemini.StatusSuccess, mimeType))
response := append(headerBytes, data...)
return response, nil
}
func generateResponseDir(conn *tls.Conn, connId string, url *common.URL, localPath string) (output []byte, err error) {
func generateResponseDir(ctx context.Context, localPath string) (output []byte, err error) {
if err := ctx.Err(); err != nil {
return nil, err
}
entries, err := os.ReadDir(localPath)
if err != nil {
return nil, errors.NewConnectionError(fmt.Errorf("%s %s failed to read directory: %w", connId, conn.RemoteAddr(), err))
return nil, apperrors.NewGeminiError(fmt.Errorf("failed to access path: %w", err), gemini.StatusNotFound)
}
if config.CONFIG.DirIndexingEnabled {
@@ -80,27 +145,27 @@ func generateResponseDir(conn *tls.Conn, connId string, url *common.URL, localPa
contents = append(contents, "Directory index:\n\n")
contents = append(contents, "=> ../\n")
for _, entry := range entries {
// URL-encode entry names for safety
safeName := url.PathEscape(entry.Name())
if entry.IsDir() {
contents = append(contents, fmt.Sprintf("=> %s/\n", entry.Name()))
contents = append(contents, fmt.Sprintf("=> %s/\n", safeName))
} else {
contents = append(contents, fmt.Sprintf("=> %s\n", entry.Name()))
contents = append(contents, fmt.Sprintf("=> %s\n", safeName))
}
}
data := []byte(strings.Join(contents, ""))
headerBytes := []byte("20 text/gemini;\r\n")
headerBytes := []byte(fmt.Sprintf("%d text/gemini; lang=en\r\n", gemini.StatusSuccess))
response := append(headerBytes, data...)
return response, nil
} else {
filePath := path.Join(localPath, "index.gmi")
return generateResponseFile(conn, connId, url, filePath)
}
filePath := filepath.Join(localPath, "index.gmi")
return generateResponseFile(ctx, filePath)
}
func calculateLocalPath(input string, basePath string) (string, error) {
// Check for invalid characters early
if strings.ContainsAny(input, "\\") {
return "", errors.NewError(fmt.Errorf("invalid characters in path: %s", input))
return "", apperrors.NewGeminiError(fmt.Errorf("invalid characters in path: %s", input), gemini.StatusBadRequest)
}
// If IsLocal(path) returns true, then Join(base, path)
@@ -116,9 +181,116 @@ func calculateLocalPath(input string, basePath string) (string, error) {
localPath, err := filepath.Localize(filePath)
if err != nil || !filepath.IsLocal(localPath) {
return "", errors.NewError(fmt.Errorf("could not construct local path from %s: %s", input, err))
return "", apperrors.NewGeminiError(fmt.Errorf("could not construct local path from %s: %s", input, err), gemini.StatusBadRequest)
}
filePath = path.Join(basePath, localPath)
return filePath, nil
}
func HandleConnection(ctx context.Context, conn *tls.Conn) (err error) {
logger := logging.FromContext(ctx)
start := time.Now()
var outputBytes []byte
// Set connection deadline based on context
if deadline, ok := ctx.Deadline(); ok {
_ = conn.SetDeadline(deadline)
}
defer func(conn *tls.Conn) {
end := time.Now()
tookMs := end.Sub(start).Milliseconds()
var responseHeader string
// On non-errors, just log response and close connection.
if err == nil {
// Log non-erroneous responses
if i := bytes.Index(outputBytes, []byte{'\r'}); i >= 0 {
responseHeader = string(outputBytes[:i])
}
logger.Debug("Response", "responseHeader", responseHeader, "ms", tookMs)
_ = CloseConnection(conn)
return
}
// Handle context cancellation/timeout
if errors.Is(err, context.DeadlineExceeded) {
logger.Info("Connection timeout", "ms", tookMs)
responseHeader = fmt.Sprintf("%d Request timeout", gemini.StatusCGIError)
_, _ = conn.Write([]byte(responseHeader + "\r\n"))
_ = CloseConnection(conn)
return
}
if errors.Is(err, context.Canceled) {
logger.Info("Connection cancelled", "ms", tookMs)
_ = CloseConnection(conn)
return
}
var code int
var responseMsg string
if apperrors.IsFatal(err) {
_ = CloseConnection(conn)
return
}
if apperrors.IsGeminiError(err) {
code = apperrors.GetStatusCode(err)
responseMsg = "server error"
} else {
code = gemini.StatusPermanentFailure
responseMsg = "server error"
}
responseHeader = fmt.Sprintf("%d %s", code, responseMsg)
_, _ = conn.Write([]byte(responseHeader + "\r\n"))
_ = CloseConnection(conn)
}(conn)
// Check context before starting
if err := ctx.Err(); err != nil {
return err
}
// Gemini is supposed to have a 1kb limit
// on input requests.
buffer := make([]byte, 1025)
n, err := conn.Read(buffer)
if err != nil && err != io.EOF {
return apperrors.NewGeminiError(fmt.Errorf("failed to read connection data: %w", err), gemini.StatusBadRequest)
}
if n == 0 {
return apperrors.NewGeminiError(fmt.Errorf("client did not send data"), gemini.StatusBadRequest)
}
if n > 1024 {
return apperrors.NewGeminiError(fmt.Errorf("client request size %d > 1024 bytes", n), gemini.StatusBadRequest)
}
// Check context after read
if err := ctx.Err(); err != nil {
return err
}
dataBytes := buffer[:n]
dataString := string(dataBytes)
logger.Info("Request", "data", strings.TrimSpace(dataString), "size", len(dataBytes))
outputBytes, err = GenerateResponse(ctx, conn, dataString)
if len(outputBytes) > config.CONFIG.MaxResponseSize {
return apperrors.NewGeminiError(fmt.Errorf("max response size reached"), gemini.StatusTemporaryFailure)
}
if err != nil {
return err
}
// Check context before write
if err := ctx.Err(); err != nil {
return err
}
_, err = conn.Write(outputBytes)
if err != nil {
return err
}
return nil
}

3
srv/index.gmi Normal file

@@ -0,0 +1,3 @@
# Hello world!
This is a test gemini file.

5
vendor/git.antanst.com/antanst/uid/README.md generated vendored Normal file

@@ -0,0 +1,5 @@
# UID
This project generates a reasonably secure and convenient UID.
Borrows code from https://github.com/matoous/go-nanoid
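A minimal usage sketch (assuming the import path from gemserve's go.mod replace directive):
```go
package main

import (
	"fmt"

	"git.antanst.com/antanst/uid"
)

func main() {
	// Default alphabet and length; panics only if crypto/rand fails.
	fmt.Println(uid.UID())
	// Lower-level form with a custom alphabet and size.
	id, err := uid.Generate("0123456789abcdef", 8)
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}
```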

68
vendor/git.antanst.com/antanst/uid/uid.go generated vendored Normal file

@@ -0,0 +1,68 @@
package uid
import (
"crypto/rand"
"errors"
"math"
)
// UID is a high level function that returns a reasonably secure UID.
// Only alphanumeric characters except 'o','O' and 'l'
func UID() string {
id, err := Generate("abcdefghijkmnpqrstuvwxyzABCDEFGHIJKLMNPQRSTUVWXYZ0123456789", 20)
if err != nil {
panic(err)
}
return id
}
// Generate is a low-level function to change alphabet and ID size.
// Taken from go-nanoid project
func Generate(alphabet string, size int) (string, error) {
chars := []rune(alphabet)
if len(alphabet) == 0 || len(alphabet) > 255 {
return "", errors.New("alphabet must not be empty and contain no more than 255 chars")
}
if size <= 0 {
return "", errors.New("size must be positive integer")
}
mask := getMask(len(chars))
// estimate how many random bytes we will need for the ID, we might actually need more but this is tradeoff
// between average case and worst case
ceilArg := 1.6 * float64(mask*size) / float64(len(alphabet))
step := int(math.Ceil(ceilArg))
id := make([]rune, size)
bytes := make([]byte, step)
for j := 0; ; {
_, err := rand.Read(bytes)
if err != nil {
return "", err
}
for i := 0; i < step; i++ {
currByte := bytes[i] & byte(mask)
if currByte < byte(len(chars)) {
id[j] = chars[currByte]
j++
if j == size {
return string(id[:size]), nil
}
}
}
}
}
// getMask generates bit mask used to obtain bits from the random bytes that are used to get index of random character
// from the alphabet. Example: if the alphabet has 6 = (110)_2 characters it is sufficient to use mask 7 = (111)_2
// Taken from go-nanoid project
func getMask(alphabetSize int) int {
for i := 1; i <= 8; i++ {
mask := (2 << uint(i)) - 1
if mask >= alphabetSize-1 {
return mask
}
}
return 0
}


@@ -0,0 +1 @@
testdata/* linguist-vendored


@@ -0,0 +1,5 @@
version: "2"
linters:
exclusions:
presets:
- std-error-handling

21
vendor/github.com/gabriel-vasile/mimetype/LICENSE generated vendored Normal file

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2018 Gabriel Vasile
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

103
vendor/github.com/gabriel-vasile/mimetype/README.md generated vendored Normal file

@@ -0,0 +1,103 @@
<h1 align="center">
mimetype
</h1>
<h4 align="center">
A package for detecting MIME types and extensions based on magic numbers
</h4>
<h6 align="center">
Goroutine safe, extensible, no C bindings
</h6>
<p align="center">
<a href="https://pkg.go.dev/github.com/gabriel-vasile/mimetype">
<img alt="Go Reference" src="https://pkg.go.dev/badge/github.com/gabriel-vasile/mimetype.svg">
</a>
<a href="https://goreportcard.com/report/github.com/gabriel-vasile/mimetype">
<img alt="Go report card" src="https://goreportcard.com/badge/github.com/gabriel-vasile/mimetype">
</a>
<a href="LICENSE">
<img alt="License" src="https://img.shields.io/badge/License-MIT-green.svg">
</a>
</p>
## Features
- fast and precise MIME type and file extension detection
- long list of [supported MIME types](supported_mimes.md)
- possibility to [extend](https://pkg.go.dev/github.com/gabriel-vasile/mimetype#example-package-Extend) with other file formats
- common file formats are prioritized
- [text vs. binary files differentiation](https://pkg.go.dev/github.com/gabriel-vasile/mimetype#example-package-TextVsBinary)
- no external dependencies
- safe for concurrent usage
## Install
```bash
go get github.com/gabriel-vasile/mimetype
```
## Usage
```go
mtype := mimetype.Detect([]byte)
// OR
mtype, err := mimetype.DetectReader(io.Reader)
// OR
mtype, err := mimetype.DetectFile("/path/to/file")
fmt.Println(mtype.String(), mtype.Extension())
```
See the [runnable Go Playground examples](https://pkg.go.dev/github.com/gabriel-vasile/mimetype#pkg-overview).
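For context, a fuller, self-contained version of the snippet above (the file path is hypothetical):
```go
package main

import (
	"fmt"

	"github.com/gabriel-vasile/mimetype"
)

func main() {
	// Detect from an in-memory buffer; Detect never returns nil.
	mtype := mimetype.Detect([]byte("%PDF-1.7 ..."))
	fmt.Println(mtype.String(), mtype.Extension()) // application/pdf .pdf

	// Detect from a file on disk (hypothetical path).
	if m, err := mimetype.DetectFile("/path/to/file"); err == nil {
		fmt.Println(m.Is("application/zip"))
	}
}
```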
Caution: only use libraries like **mimetype** as a last resort. Content type detection
using magic numbers is slow, inaccurate, and non-standard. Most of the time,
protocols have methods for specifying such metadata; e.g., the `Content-Type` header
in HTTP and SMTP.
## FAQ
Q: My file is in the list of [supported MIME types](supported_mimes.md) but
it is not correctly detected. What should I do?
A: Some file formats (often Microsoft Office documents) keep their signatures
towards the end of the file. Try increasing the number of bytes used for detection
with:
```go
mimetype.SetLimit(1024*1024) // Set limit to 1MB.
// or
mimetype.SetLimit(0) // No limit, whole file content used.
mimetype.DetectFile("file.doc")
```
If increasing the limit does not help, please
[open an issue](https://github.com/gabriel-vasile/mimetype/issues/new?assignees=&labels=&template=mismatched-mime-type-detected.md&title=).
## Tests
In addition to unit tests,
[mimetype_tests](https://github.com/gabriel-vasile/mimetype_tests) compares the
library with the [Unix file utility](https://en.wikipedia.org/wiki/File_(command))
for around 50 000 sample files. Check the latest comparison results
[here](https://github.com/gabriel-vasile/mimetype_tests/actions).
## Benchmarks
Benchmarks for each file format are performed when a PR is open. The results can
be seen on the [workflows page](https://github.com/gabriel-vasile/mimetype/actions/workflows/benchmark.yml).
Performance improvements are welcome but correctness is prioritized.
## Structure
**mimetype** uses a hierarchical structure to keep the MIME type detection logic.
This reduces the number of calls needed for detecting the file type. The reason
behind this choice is that there are file formats used as containers for other
file formats. For example, Microsoft Office files are just zip archives,
containing specific metadata files. Once a file has been identified as a
zip, there is no need to check if it is a text file, but it is worth checking if
it is a Microsoft Office file.
To prevent loading entire files into memory, when detecting from a
[reader](https://pkg.go.dev/github.com/gabriel-vasile/mimetype#DetectReader)
or from a [file](https://pkg.go.dev/github.com/gabriel-vasile/mimetype#DetectFile)
**mimetype** limits itself to reading only the header of the input.
<div align="center">
<img alt="how project is structured" src="https://raw.githubusercontent.com/gabriel-vasile/mimetype/master/testdata/gif.gif" width="88%">
</div>
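To make the container idea concrete, here is a minimal, hypothetical sketch of hierarchical detection (not the library's actual internals): a child detector is only consulted once its parent has matched.
```go
package main

import (
	"bytes"
	"fmt"
)

// node is a hypothetical detector tree node. Children are only tried when the
// parent's detector matches, mirroring the container idea: Office files are
// only checked once the input is already known to be a zip archive.
type node struct {
	mime     string
	detect   func(raw []byte) bool
	children []*node
}

func (n *node) match(raw []byte) string {
	if !n.detect(raw) {
		return ""
	}
	for _, c := range n.children {
		if m := c.match(raw); m != "" {
			return m
		}
	}
	return n.mime
}

func main() {
	root := &node{
		mime:   "application/zip",
		detect: func(raw []byte) bool { return bytes.HasPrefix(raw, []byte("PK\x03\x04")) },
		children: []*node{{
			mime:   "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
			detect: func(raw []byte) bool { return false }, // placeholder for a "word/" entry check
		}},
	}
	fmt.Println(root.match([]byte("PK\x03\x04...."))) // application/zip
}
```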
## Contributing
Contributions are unexpected but welcome. When submitting a PR for detection of
a new file format, please make sure to add a record to the list of testcases
from [mimetype_test.go](mimetype_test.go). For complex files a record can be added
in the [testdata](testdata) directory.

View File

@@ -0,0 +1,283 @@
package charset
import (
"bytes"
"unicode/utf8"
"github.com/gabriel-vasile/mimetype/internal/markup"
"github.com/gabriel-vasile/mimetype/internal/scan"
)
const (
F = 0 /* character never appears in text */
T = 1 /* character appears in plain ASCII text */
I = 2 /* character appears in ISO-8859 text */
X = 3 /* character appears in non-ISO extended ASCII (Mac, IBM PC) */
)
var (
boms = []struct {
bom []byte
enc string
}{
{[]byte{0xEF, 0xBB, 0xBF}, "utf-8"},
{[]byte{0x00, 0x00, 0xFE, 0xFF}, "utf-32be"},
{[]byte{0xFF, 0xFE, 0x00, 0x00}, "utf-32le"},
{[]byte{0xFE, 0xFF}, "utf-16be"},
{[]byte{0xFF, 0xFE}, "utf-16le"},
}
// https://github.com/file/file/blob/fa93fb9f7d21935f1c7644c47d2975d31f12b812/src/encoding.c#L241
textChars = [256]byte{
/* BEL BS HT LF VT FF CR */
F, F, F, F, F, F, F, T, T, T, T, T, T, T, F, F, /* 0x0X */
/* ESC */
F, F, F, F, F, F, F, F, F, F, F, T, F, F, F, F, /* 0x1X */
T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, /* 0x2X */
T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, /* 0x3X */
T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, /* 0x4X */
T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, /* 0x5X */
T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, /* 0x6X */
T, T, T, T, T, T, T, T, T, T, T, T, T, T, T, F, /* 0x7X */
/* NEL */
X, X, X, X, X, T, X, X, X, X, X, X, X, X, X, X, /* 0x8X */
X, X, X, X, X, X, X, X, X, X, X, X, X, X, X, X, /* 0x9X */
I, I, I, I, I, I, I, I, I, I, I, I, I, I, I, I, /* 0xaX */
I, I, I, I, I, I, I, I, I, I, I, I, I, I, I, I, /* 0xbX */
I, I, I, I, I, I, I, I, I, I, I, I, I, I, I, I, /* 0xcX */
I, I, I, I, I, I, I, I, I, I, I, I, I, I, I, I, /* 0xdX */
I, I, I, I, I, I, I, I, I, I, I, I, I, I, I, I, /* 0xeX */
I, I, I, I, I, I, I, I, I, I, I, I, I, I, I, I, /* 0xfX */
}
)
// FromBOM returns the charset declared in the BOM of content.
func FromBOM(content []byte) string {
for _, b := range boms {
if bytes.HasPrefix(content, b.bom) {
return b.enc
}
}
return ""
}
// FromPlain returns the charset of a plain text. It relies on BOM presence
// and it falls back on checking each byte in content.
func FromPlain(content []byte) string {
if len(content) == 0 {
return ""
}
if cset := FromBOM(content); cset != "" {
return cset
}
origContent := content
// Try to detect UTF-8.
// First eliminate any partial rune at the end.
for i := len(content) - 1; i >= 0 && i > len(content)-4; i-- {
b := content[i]
if b < 0x80 {
break
}
if utf8.RuneStart(b) {
content = content[:i]
break
}
}
hasHighBit := false
for _, c := range content {
if c >= 0x80 {
hasHighBit = true
break
}
}
if hasHighBit && utf8.Valid(content) {
return "utf-8"
}
// ASCII is a subset of UTF8. Follow W3C recommendation and replace with UTF8.
if ascii(origContent) {
return "utf-8"
}
return latin(origContent)
}
func latin(content []byte) string {
hasControlBytes := false
for _, b := range content {
t := textChars[b]
if t != T && t != I {
return ""
}
if b >= 0x80 && b <= 0x9F {
hasControlBytes = true
}
}
// Code range 0x80 to 0x9F is reserved for control characters in ISO-8859-1
// (so-called C1 Controls). Windows 1252, however, has printable punctuation
// characters in this range.
if hasControlBytes {
return "windows-1252"
}
return "iso-8859-1"
}
func ascii(content []byte) bool {
for _, b := range content {
if textChars[b] != T {
return false
}
}
return true
}
// FromXML returns the charset of an XML document. It relies on the XML
// header <?xml version="1.0" encoding="UTF-8"?> and falls back on the plain
// text content.
func FromXML(content []byte) string {
if cset := fromXML(content); cset != "" {
return cset
}
return FromPlain(content)
}
func fromXML(s scan.Bytes) string {
xml := []byte("<?XML")
lxml := len(xml)
for {
if len(s) == 0 {
return ""
}
for scan.ByteIsWS(s.Peek()) {
s.Advance(1)
}
if len(s) <= lxml {
return ""
}
if !s.Match(xml, scan.IgnoreCase) {
s = s[1:] // safe to slice instead of s.Advance(1) because bounds are checked
continue
}
aName, aVal, hasMore := "", "", true
for hasMore {
aName, aVal, hasMore = markup.GetAnAttribute(&s)
if aName == "encoding" && aVal != "" {
return aVal
}
}
}
}
// FromHTML returns the charset of an HTML document. It first looks if a BOM is
// present and if so uses it to determine the charset. If no BOM is present,
// it relies on the meta tag <meta charset="UTF-8"> and falls back on the
// plain text content.
func FromHTML(content []byte) string {
if cset := FromBOM(content); cset != "" {
return cset
}
if cset := fromHTML(content); cset != "" {
return cset
}
return FromPlain(content)
}
func fromHTML(s scan.Bytes) string {
const (
dontKnow = iota
doNeedPragma
doNotNeedPragma
)
meta := []byte("<META")
body := []byte("<BODY")
lmeta := len(meta)
for {
if markup.SkipAComment(&s) {
continue
}
if len(s) <= lmeta {
return ""
}
// Abort when <body is reached.
if s.Match(body, scan.IgnoreCase) {
return ""
}
if !s.Match(meta, scan.IgnoreCase) {
s = s[1:] // safe to slice instead of s.Advance(1) because bounds are checked
continue
}
s = s[lmeta:]
c := s.Pop()
if c == 0 || (!scan.ByteIsWS(c) && c != '/') {
return ""
}
attrList := make(map[string]bool)
gotPragma := false
needPragma := dontKnow
charset := ""
aName, aVal, hasMore := "", "", true
for hasMore {
aName, aVal, hasMore = markup.GetAnAttribute(&s)
if attrList[aName] {
continue
}
// processing step
if len(aName) == 0 && len(aVal) == 0 {
if needPragma == dontKnow {
continue
}
if needPragma == doNeedPragma && !gotPragma {
continue
}
}
attrList[aName] = true
if aName == "http-equiv" && scan.Bytes(aVal).Match([]byte("CONTENT-TYPE"), scan.IgnoreCase) {
gotPragma = true
} else if aName == "content" {
charset = string(extractCharsetFromMeta(scan.Bytes(aVal)))
if len(charset) != 0 {
needPragma = doNeedPragma
}
} else if aName == "charset" {
charset = aVal
needPragma = doNotNeedPragma
}
}
if needPragma == dontKnow || needPragma == doNeedPragma && !gotPragma {
continue
}
return charset
}
}
// https://html.spec.whatwg.org/multipage/urls-and-fetching.html#algorithm-for-extracting-a-character-encoding-from-a-meta-element
func extractCharsetFromMeta(s scan.Bytes) []byte {
for {
i := bytes.Index(s, []byte("charset"))
if i == -1 {
return nil
}
s.Advance(i + len("charset"))
for scan.ByteIsWS(s.Peek()) {
s.Advance(1)
}
if s.Pop() != '=' {
continue
}
for scan.ByteIsWS(s.Peek()) {
s.Advance(1)
}
quote := s.Peek()
if quote == 0 {
return nil
}
if quote == '"' || quote == '\'' {
s.Advance(1)
return bytes.TrimSpace(s.PopUntil(quote))
}
return bytes.TrimSpace(s.PopUntil(';', '\t', '\n', '\x0c', '\r', ' '))
}
}
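
As a standalone illustration of the trimming step in FromPlain above (a sketch, not the package API): a buffer cut in the middle of a multi-byte rune is trimmed back to the last rune boundary before being validated.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// trimTrailingRune drops the final, possibly incomplete multi-byte sequence,
// so a header cut at an arbitrary byte can still be validated with utf8.Valid.
func trimTrailingRune(b []byte) []byte {
	for i := len(b) - 1; i >= 0 && i > len(b)-4; i-- {
		if b[i] < 0x80 {
			break // last byte is ASCII, nothing to trim
		}
		if utf8.RuneStart(b[i]) {
			return b[:i] // cut at the start of the (possibly partial) rune
		}
	}
	return b
}

func main() {
	full := []byte("héllo") // 'é' is encoded as two bytes
	cut := full[:2]         // ends in the middle of 'é'
	fmt.Println(utf8.Valid(cut))                   // false
	fmt.Println(utf8.Valid(trimTrailingRune(cut))) // true
}
```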

View File

@@ -0,0 +1,125 @@
package csv
import (
"bytes"
"github.com/gabriel-vasile/mimetype/internal/scan"
)
// Parser is a CSV reader that only counts fields.
// It avoids allocating/copying memory and, to verify behaviour, it is tested
// and fuzzed against the encoding/csv parser.
type Parser struct {
comma byte
comment byte
s scan.Bytes
}
func NewParser(comma, comment byte, s scan.Bytes) *Parser {
return &Parser{
comma: comma,
comment: comment,
s: s,
}
}
func (r *Parser) readLine() (line []byte, cutShort bool) {
line = r.s.ReadSlice('\n')
n := len(line)
if n > 0 && line[n-1] == '\r' {
return line[:n-1], false // drop \r at end of line
}
// This line is problematic. The logic from CountFields comes from
// encoding/csv.Reader which relies on mutating the input bytes.
// https://github.com/golang/go/blob/b3251514531123d7fd007682389bce7428d159a0/src/encoding/csv/reader.go#L275-L279
// To avoid mutating the input, we return cutShort. #680
if n >= 2 && line[n-2] == '\r' && line[n-1] == '\n' {
return line[:n-2], true
}
return line, false
}
// CountFields reads one CSV line and counts how many fields that line contains.
// hasMore reports whether there are more lines in the input.
// collectIndexes makes CountFields return a list of indexes where CSV fields
// start in the line. These indexes are used to test the correctness against the
// encoding/csv parser.
func (r *Parser) CountFields(collectIndexes bool) (fields int, fieldPos []int, hasMore bool) {
finished := false
var line scan.Bytes
cutShort := false
for {
line, cutShort = r.readLine()
if finished {
return 0, nil, false
}
finished = len(r.s) == 0 && len(line) == 0
if len(line) == lengthNL(line) {
line = nil
continue // Skip empty lines.
}
if len(line) > 0 && line[0] == r.comment {
line = nil
continue
}
break
}
indexes := []int{}
originalLine := line
parseField:
for {
if len(line) == 0 || line[0] != '"' { // non-quoted string field
fields++
if collectIndexes {
indexes = append(indexes, len(originalLine)-len(line))
}
i := bytes.IndexByte(line, r.comma)
if i >= 0 {
line.Advance(i + 1) // 1 to get over ending comma
continue parseField
}
break parseField
} else { // Quoted string field.
if collectIndexes {
indexes = append(indexes, len(originalLine)-len(line))
}
line.Advance(1) // get over starting quote
for {
i := bytes.IndexByte(line, '"')
if i >= 0 {
line.Advance(i + 1) // 1 for ending quote
switch rn := line.Peek(); {
case rn == '"':
line.Advance(1)
case rn == r.comma:
line.Advance(1)
fields++
continue parseField
case lengthNL(line) == len(line):
fields++
break parseField
}
} else if len(line) > 0 || cutShort {
line, cutShort = r.readLine()
originalLine = line
} else {
fields++
break parseField
}
}
}
}
return fields, indexes, fields != 0
}
// lengthNL reports the number of bytes for the trailing \n.
func lengthNL(b []byte) int {
if len(b) > 0 && b[len(b)-1] == '\n' {
return 1
}
return 0
}

View File

@@ -0,0 +1,478 @@
package json
import (
"bytes"
"sync"
)
const (
QueryNone = "json"
QueryGeo = "geo"
QueryHAR = "har"
QueryGLTF = "gltf"
maxRecursion = 4096
)
var queries = map[string][]query{
QueryNone: nil,
QueryGeo: {{
SearchPath: [][]byte{[]byte("type")},
SearchVals: [][]byte{
[]byte(`"Feature"`),
[]byte(`"FeatureCollection"`),
[]byte(`"Point"`),
[]byte(`"LineString"`),
[]byte(`"Polygon"`),
[]byte(`"MultiPoint"`),
[]byte(`"MultiLineString"`),
[]byte(`"MultiPolygon"`),
[]byte(`"GeometryCollection"`),
},
}},
QueryHAR: {{
SearchPath: [][]byte{[]byte("log"), []byte("version")},
}, {
SearchPath: [][]byte{[]byte("log"), []byte("creator")},
}, {
SearchPath: [][]byte{[]byte("log"), []byte("entries")},
}},
QueryGLTF: {{
SearchPath: [][]byte{[]byte("asset"), []byte("version")},
SearchVals: [][]byte{[]byte(`"1.0"`), []byte(`"2.0"`)},
}},
}
var parserPool = sync.Pool{
New: func() any {
return &parserState{maxRecursion: maxRecursion}
},
}
// parserState holds the state of JSON parsing. The number of inspected bytes,
// the current path inside the JSON object, etc.
type parserState struct {
// ib represents the number of inspected bytes.
// Because mimetype limits itself to only reading the header of the file,
// it means sometimes the input JSON can be truncated. In that case, we want
// to still detect it as JSON, even if it's invalid/truncated.
// When ib == len(input) it means the JSON was valid (at least the header).
ib int
maxRecursion int
// currPath keeps track of the JSON keys parsed so far.
// It works only for JSON objects. JSON arrays are ignored
// mainly because the functionality is not needed.
currPath [][]byte
// firstToken stores the first JSON token encountered in input.
// TODO: performance would be better if we would stop parsing as soon
// as we see that first token is not what we are interested in.
firstToken int
// querySatisfied is true if both path and value of any queries passed to
// consumeAny are satisfied.
querySatisfied bool
}
// query holds information about a combination of {"key": "val"} that we're trying
// to search for inside the JSON.
type query struct {
// SearchPath represents the whole path to look for inside the JSON.
// ex: [][]byte{[]byte("foo"), []byte("bar")} matches {"foo": {"bar": "baz"}}
SearchPath [][]byte
// SearchVals represents values to look for when the SearchPath is found.
// Each SearchVal element is tried until one of them matches (logical OR.)
SearchVals [][]byte
}
func eq(path1, path2 [][]byte) bool {
if len(path1) != len(path2) {
return false
}
for i := range path1 {
if !bytes.Equal(path1[i], path2[i]) {
return false
}
}
return true
}
// LooksLikeObjectOrArray reports whether the first non-whitespace character from raw
// is either { or [. Parsing raw as JSON is a heavy operation. When receiving some
// text input we can skip parsing if the input does not even look like JSON.
func LooksLikeObjectOrArray(raw []byte) bool {
for i := range raw {
if isSpace(raw[i]) {
continue
}
return raw[i] == '{' || raw[i] == '['
}
return false
}
// Parse will take out a parser from the pool depending on queryType and tries
// to parse raw bytes as JSON.
func Parse(queryType string, raw []byte) (parsed, inspected, firstToken int, querySatisfied bool) {
p := parserPool.Get().(*parserState)
defer func() {
// Avoid hanging on to too much memory in extreme input cases.
if len(p.currPath) > 128 {
p.currPath = nil
}
parserPool.Put(p)
}()
p.reset()
qs := queries[queryType]
got := p.consumeAny(raw, qs, 0)
return got, p.ib, p.firstToken, p.querySatisfied
}
func (p *parserState) reset() {
p.ib = 0
p.currPath = p.currPath[0:0]
p.firstToken = TokInvalid
p.querySatisfied = false
}
func (p *parserState) consumeSpace(b []byte) (n int) {
for len(b) > 0 && isSpace(b[0]) {
b = b[1:]
n++
p.ib++
}
return n
}
func (p *parserState) consumeConst(b, cnst []byte) int {
lb := len(b)
for i, c := range cnst {
if lb > i && b[i] == c {
p.ib++
} else {
return 0
}
}
return len(cnst)
}
func (p *parserState) consumeString(b []byte) (n int) {
var c byte
for len(b[n:]) > 0 {
c, n = b[n], n+1
p.ib++
switch c {
case '\\':
if len(b[n:]) == 0 {
return 0
}
switch b[n] {
case '"', '\\', '/', 'b', 'f', 'n', 'r', 't':
n++
p.ib++
continue
case 'u':
n++
p.ib++
for j := 0; j < 4 && len(b[n:]) > 0; j++ {
if !isXDigit(b[n]) {
return 0
}
n++
p.ib++
}
continue
default:
return 0
}
case '"':
return n
default:
continue
}
}
return 0
}
func (p *parserState) consumeNumber(b []byte) (n int) {
got := false
var i int
if len(b) == 0 {
goto out
}
if b[0] == '-' {
b, i = b[1:], i+1
p.ib++
}
for len(b) > 0 {
if !isDigit(b[0]) {
break
}
got = true
b, i = b[1:], i+1
p.ib++
}
if len(b) == 0 {
goto out
}
if b[0] == '.' {
b, i = b[1:], i+1
p.ib++
}
for len(b) > 0 {
if !isDigit(b[0]) {
break
}
got = true
b, i = b[1:], i+1
p.ib++
}
if len(b) == 0 {
goto out
}
if got && (b[0] == 'e' || b[0] == 'E') {
b, i = b[1:], i+1
p.ib++
got = false
if len(b) == 0 {
goto out
}
if b[0] == '+' || b[0] == '-' {
b, i = b[1:], i+1
p.ib++
}
for len(b) > 0 {
if !isDigit(b[0]) {
break
}
got = true
b, i = b[1:], i+1
p.ib++
}
}
out:
if got {
return i
}
return 0
}
func (p *parserState) consumeArray(b []byte, qs []query, lvl int) (n int) {
p.appendPath([]byte{'['}, qs)
if len(b) == 0 {
return 0
}
for n < len(b) {
n += p.consumeSpace(b[n:])
if len(b[n:]) == 0 {
return 0
}
if b[n] == ']' {
p.ib++
p.popLastPath(qs)
return n + 1
}
innerParsed := p.consumeAny(b[n:], qs, lvl)
if innerParsed == 0 {
return 0
}
n += innerParsed
if len(b[n:]) == 0 {
return 0
}
switch b[n] {
case ',':
n += 1
p.ib++
continue
case ']':
p.ib++
return n + 1
default:
return 0
}
}
return 0
}
func queryPathMatch(qs []query, path [][]byte) int {
for i := range qs {
if eq(qs[i].SearchPath, path) {
return i
}
}
return -1
}
// appendPath will append a path fragment if queries is not empty.
// If we don't need query functionality (just checking if a JSON is valid),
// then we can skip keeping track of the path we're currently in.
func (p *parserState) appendPath(path []byte, qs []query) {
if len(qs) != 0 {
p.currPath = append(p.currPath, path)
}
}
func (p *parserState) popLastPath(qs []query) {
if len(qs) != 0 {
p.currPath = p.currPath[:len(p.currPath)-1]
}
}
func (p *parserState) consumeObject(b []byte, qs []query, lvl int) (n int) {
for n < len(b) {
n += p.consumeSpace(b[n:])
if len(b[n:]) == 0 {
return 0
}
if b[n] == '}' {
p.ib++
return n + 1
}
if b[n] != '"' {
return 0
} else {
n += 1
p.ib++
}
// queryMatched stores the index of the query satisfying the current path.
queryMatched := -1
if keyLen := p.consumeString(b[n:]); keyLen == 0 {
return 0
} else {
p.appendPath(b[n:n+keyLen-1], qs)
if !p.querySatisfied {
queryMatched = queryPathMatch(qs, p.currPath)
}
n += keyLen
}
n += p.consumeSpace(b[n:])
if len(b[n:]) == 0 {
return 0
}
if b[n] != ':' {
return 0
} else {
n += 1
p.ib++
}
n += p.consumeSpace(b[n:])
if len(b[n:]) == 0 {
return 0
}
if valLen := p.consumeAny(b[n:], qs, lvl); valLen == 0 {
return 0
} else {
if queryMatched != -1 {
q := qs[queryMatched]
if len(q.SearchVals) == 0 {
p.querySatisfied = true
}
for _, val := range q.SearchVals {
if bytes.Equal(val, bytes.TrimSpace(b[n:n+valLen])) {
p.querySatisfied = true
}
}
}
n += valLen
}
if len(b[n:]) == 0 {
return 0
}
switch b[n] {
case ',':
p.popLastPath(qs)
n++
p.ib++
continue
case '}':
p.popLastPath(qs)
p.ib++
return n + 1
default:
return 0
}
}
return 0
}
func (p *parserState) consumeAny(b []byte, qs []query, lvl int) (n int) {
// Avoid too much recursion.
if p.maxRecursion != 0 && lvl > p.maxRecursion {
return 0
}
if len(qs) == 0 {
p.querySatisfied = true
}
n += p.consumeSpace(b)
if len(b[n:]) == 0 {
return 0
}
var t, rv int
switch b[n] {
case '"':
n++
p.ib++
rv = p.consumeString(b[n:])
t = TokString
case '[':
n++
p.ib++
rv = p.consumeArray(b[n:], qs, lvl+1)
t = TokArray
case '{':
n++
p.ib++
rv = p.consumeObject(b[n:], qs, lvl+1)
t = TokObject
case 't':
rv = p.consumeConst(b[n:], []byte("true"))
t = TokTrue
case 'f':
rv = p.consumeConst(b[n:], []byte("false"))
t = TokFalse
case 'n':
rv = p.consumeConst(b[n:], []byte("null"))
t = TokNull
default:
rv = p.consumeNumber(b[n:])
t = TokNumber
}
if lvl == 0 {
p.firstToken = t
}
if rv <= 0 {
return n
}
n += rv
n += p.consumeSpace(b[n:])
return n
}
func isSpace(c byte) bool {
return c == ' ' || c == '\t' || c == '\r' || c == '\n'
}
func isDigit(c byte) bool {
return '0' <= c && c <= '9'
}
func isXDigit(c byte) bool {
if isDigit(c) {
return true
}
return ('a' <= c && c <= 'f') || ('A' <= c && c <= 'F')
}
const (
TokInvalid = 0
TokNull = 1 << iota
TokTrue
TokFalse
TokNumber
TokString
TokArray
TokObject
TokComma
)
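
The query tables above describe key paths (and optional values) that must be present; e.g. a HAR file needs log.version, log.creator and log.entries. A hypothetical way to express the same check with the standard library, shown only to clarify what the queries mean (the package itself deliberately avoids full parsing):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hasPath reports whether every key in path exists, nested, inside doc.
func hasPath(doc map[string]any, path ...string) bool {
	var cur any = doc
	for _, key := range path {
		obj, ok := cur.(map[string]any)
		if !ok {
			return false
		}
		val, ok := obj[key]
		if !ok {
			return false
		}
		cur = val
	}
	return true
}

func main() {
	raw := []byte(`{"log":{"version":"1.2","creator":{"name":"x"},"entries":[]}}`)
	var doc map[string]any
	if err := json.Unmarshal(raw, &doc); err != nil {
		panic(err)
	}
	// The HAR query is satisfied when all three paths exist.
	fmt.Println(hasPath(doc, "log", "version") &&
		hasPath(doc, "log", "creator") &&
		hasPath(doc, "log", "entries")) // true
}
```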

View File

@@ -0,0 +1,163 @@
package magic
import (
"bytes"
"encoding/binary"
)
var (
// SevenZ matches a 7z archive.
SevenZ = prefix([]byte{0x37, 0x7A, 0xBC, 0xAF, 0x27, 0x1C})
// Gzip matches gzip files based on http://www.zlib.org/rfc-gzip.html#header-trailer.
Gzip = prefix([]byte{0x1f, 0x8b})
// Fits matches a Flexible Image Transport System file.
Fits = prefix([]byte{
0x53, 0x49, 0x4D, 0x50, 0x4C, 0x45, 0x20, 0x20, 0x3D, 0x20,
0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20,
0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x20, 0x54,
})
// Xar matches an eXtensible ARchive format file.
Xar = prefix([]byte{0x78, 0x61, 0x72, 0x21})
// Bz2 matches a bzip2 file.
Bz2 = prefix([]byte{0x42, 0x5A, 0x68})
// Ar matches an ar (Unix) archive file.
Ar = prefix([]byte{0x21, 0x3C, 0x61, 0x72, 0x63, 0x68, 0x3E})
// Deb matches a Debian package file.
Deb = offset([]byte{
0x64, 0x65, 0x62, 0x69, 0x61, 0x6E, 0x2D,
0x62, 0x69, 0x6E, 0x61, 0x72, 0x79,
}, 8)
// Warc matches a Web ARChive file.
Warc = prefix([]byte("WARC/1.0"), []byte("WARC/1.1"))
// Cab matches a Microsoft Cabinet archive file.
Cab = prefix([]byte("MSCF\x00\x00\x00\x00"))
// Xz matches an xz compressed stream based on https://tukaani.org/xz/xz-file-format.txt.
Xz = prefix([]byte{0xFD, 0x37, 0x7A, 0x58, 0x5A, 0x00})
// Lzip matches an Lzip compressed file.
Lzip = prefix([]byte{0x4c, 0x5a, 0x49, 0x50})
// RPM matches an RPM or Delta RPM package file.
RPM = prefix([]byte{0xed, 0xab, 0xee, 0xdb}, []byte("drpm"))
// Cpio matches a cpio archive file.
Cpio = prefix([]byte("070707"), []byte("070701"), []byte("070702"))
// RAR matches a RAR archive file.
RAR = prefix([]byte("Rar!\x1A\x07\x00"), []byte("Rar!\x1A\x07\x01\x00"))
)
// InstallShieldCab matches an InstallShield Cabinet archive file.
func InstallShieldCab(raw []byte, _ uint32) bool {
return len(raw) > 7 &&
bytes.Equal(raw[0:4], []byte("ISc(")) &&
raw[6] == 0 &&
(raw[7] == 1 || raw[7] == 2 || raw[7] == 4)
}
// Zstd matches a Zstandard archive file.
// https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md
func Zstd(raw []byte, limit uint32) bool {
if len(raw) < 4 {
return false
}
sig := binary.LittleEndian.Uint32(raw)
// Check for Zstandard frames and skippable frames.
return (sig >= 0xFD2FB522 && sig <= 0xFD2FB528) ||
(sig >= 0x184D2A50 && sig <= 0x184D2A5F)
}
// CRX matches a Chrome extension file: a zip archive prepended by a package header.
func CRX(raw []byte, limit uint32) bool {
const minHeaderLen = 16
if len(raw) < minHeaderLen || !bytes.HasPrefix(raw, []byte("Cr24")) {
return false
}
pubkeyLen := binary.LittleEndian.Uint32(raw[8:12])
sigLen := binary.LittleEndian.Uint32(raw[12:16])
zipOffset := minHeaderLen + pubkeyLen + sigLen
if uint32(len(raw)) < zipOffset {
return false
}
return Zip(raw[zipOffset:], limit)
}
// Tar matches a (t)ape (ar)chive file.
// Tar files are divided into 512-byte records. The first record contains a
// 257-byte header padded with NUL.
func Tar(raw []byte, _ uint32) bool {
const sizeRecord = 512
// The structure of a tar header:
// type TarHeader struct {
// Name [100]byte
// Mode [8]byte
// Uid [8]byte
// Gid [8]byte
// Size [12]byte
// Mtime [12]byte
// Chksum [8]byte
// Linkflag byte
// Linkname [100]byte
// Magic [8]byte
// Uname [32]byte
// Gname [32]byte
// Devmajor [8]byte
// Devminor [8]byte
// }
if len(raw) < sizeRecord {
return false
}
raw = raw[:sizeRecord]
// First 100 bytes of the header represent the file name.
// Check if file looks like Gentoo GLEP binary package.
if bytes.Contains(raw[:100], []byte("/gpkg-1\x00")) {
return false
}
// Get the checksum recorded into the file.
recsum := tarParseOctal(raw[148:156])
if recsum == -1 {
return false
}
sum1, sum2 := tarChksum(raw)
return recsum == sum1 || recsum == sum2
}
// tarParseOctal converts octal string to decimal int.
func tarParseOctal(b []byte) int64 {
// Because unused fields are filled with NULs, we need to skip leading NULs.
// Fields may also be padded with spaces or NULs.
// So we remove leading and trailing NULs and spaces to be sure.
b = bytes.Trim(b, " \x00")
if len(b) == 0 {
return -1
}
ret := int64(0)
for _, b := range b {
if b == 0 {
break
}
if b < '0' || b > '7' {
return -1
}
ret = (ret << 3) | int64(b-'0')
}
return ret
}
// tarChksum computes the checksum for the header block b.
// The actual checksum is written to same b block after it has been calculated.
// Before calculation the bytes from b reserved for checksum have placeholder
// value of ASCII space 0x20.
// POSIX specifies a sum of the unsigned byte values, but the Sun tar used
// signed byte values. We compute and return both.
func tarChksum(b []byte) (unsigned, signed int64) {
for i, c := range b {
if 148 <= i && i < 156 {
c = ' ' // Treat the checksum field itself as all spaces.
}
unsigned += int64(c)
signed += int64(int8(c))
}
return unsigned, signed
}
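
As an aside, for well-formed header fields the octal parsing above is equivalent to trimming the padding and calling strconv.ParseInt with base 8; a small, hypothetical sketch:

```go
package main

import (
	"bytes"
	"fmt"
	"strconv"
)

func main() {
	// A tar numeric field is an octal string padded with spaces/NULs,
	// e.g. a mode field of 0644.
	field := []byte("0000644 \x00")
	trimmed := bytes.Trim(field, " \x00")
	n, err := strconv.ParseInt(string(trimmed), 8, 64)
	if err != nil {
		panic(err)
	}
	fmt.Println(n) // 420, i.e. 0644 in octal
}
```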

View File

@@ -0,0 +1,76 @@
package magic
import (
"bytes"
"encoding/binary"
)
var (
// Flac matches a Free Lossless Audio Codec file.
Flac = prefix([]byte("\x66\x4C\x61\x43\x00\x00\x00\x22"))
// Midi matches a Musical Instrument Digital Interface file.
Midi = prefix([]byte("\x4D\x54\x68\x64"))
// Ape matches a Monkey's Audio file.
Ape = prefix([]byte("\x4D\x41\x43\x20\x96\x0F\x00\x00\x34\x00\x00\x00\x18\x00\x00\x00\x90\xE3"))
// MusePack matches a Musepack file.
MusePack = prefix([]byte("MPCK"))
// Au matches a Sun Microsystems au file.
Au = prefix([]byte("\x2E\x73\x6E\x64"))
// Amr matches an Adaptive Multi-Rate file.
Amr = prefix([]byte("\x23\x21\x41\x4D\x52"))
// Voc matches a Creative Voice file.
Voc = prefix([]byte("Creative Voice File"))
// M3u matches a Playlist file.
M3u = prefix([]byte("#EXTM3U"))
// AAC matches an Advanced Audio Coding file.
AAC = prefix([]byte{0xFF, 0xF1}, []byte{0xFF, 0xF9})
)
// Mp3 matches an mp3 file.
func Mp3(raw []byte, limit uint32) bool {
if len(raw) < 3 {
return false
}
if bytes.HasPrefix(raw, []byte("ID3")) {
// MP3s with an ID3v2 tag will start with "ID3"
// ID3v1 tags, however appear at the end of the file.
return true
}
// Match MP3 files without tags
switch binary.BigEndian.Uint16(raw[:2]) & 0xFFFE {
case 0xFFFA:
// MPEG ADTS, layer III, v1
return true
case 0xFFF2:
// MPEG ADTS, layer III, v2
return true
case 0xFFE2:
// MPEG ADTS, layer III, v2.5
return true
}
return false
}
// Wav matches a Waveform Audio File Format file.
func Wav(raw []byte, limit uint32) bool {
return len(raw) > 12 &&
bytes.Equal(raw[:4], []byte("RIFF")) &&
bytes.Equal(raw[8:12], []byte{0x57, 0x41, 0x56, 0x45})
}
// Aiff matches an Audio Interchange File Format file.
func Aiff(raw []byte, limit uint32) bool {
return len(raw) > 12 &&
bytes.Equal(raw[:4], []byte{0x46, 0x4F, 0x52, 0x4D}) &&
bytes.Equal(raw[8:12], []byte{0x41, 0x49, 0x46, 0x46})
}
// Qcp matches a Qualcomm Pure Voice file.
func Qcp(raw []byte, limit uint32) bool {
return len(raw) > 12 &&
bytes.Equal(raw[:4], []byte("RIFF")) &&
bytes.Equal(raw[8:12], []byte("QLCM"))
}

View File

@@ -0,0 +1,203 @@
package magic
import (
"bytes"
"debug/macho"
"encoding/binary"
)
var (
// Lnk matches Microsoft lnk binary format.
Lnk = prefix([]byte{0x4C, 0x00, 0x00, 0x00, 0x01, 0x14, 0x02, 0x00})
// Wasm matches a web assembly File Format file.
Wasm = prefix([]byte{0x00, 0x61, 0x73, 0x6D})
// Exe matches a Windows/DOS executable file.
Exe = prefix([]byte{0x4D, 0x5A})
// Elf matches an Executable and Linkable Format file.
Elf = prefix([]byte{0x7F, 0x45, 0x4C, 0x46})
// Nes matches a Nintendo Entertainment system ROM file.
Nes = prefix([]byte{0x4E, 0x45, 0x53, 0x1A})
// SWF matches an Adobe Flash swf file.
SWF = prefix([]byte("CWS"), []byte("FWS"), []byte("ZWS"))
// Torrent has bencoded text in the beginning.
Torrent = prefix([]byte("d8:announce"))
// Par1 matches a parquet file.
Par1 = prefix([]byte{0x50, 0x41, 0x52, 0x31})
// CBOR matches a Concise Binary Object Representation https://cbor.io/
CBOR = prefix([]byte{0xD9, 0xD9, 0xF7})
)
// Java bytecode and Mach-O binaries share the same magic number.
// More info here https://github.com/threatstack/libmagic/blob/master/magic/Magdir/cafebabe
func classOrMachOFat(in []byte) bool {
// There should be at least 8 bytes for both of them because the only way to
// quickly distinguish them is by comparing byte at position 7
if len(in) < 8 {
return false
}
return binary.BigEndian.Uint32(in) == macho.MagicFat
}
// Class matches a java class file.
func Class(raw []byte, limit uint32) bool {
return classOrMachOFat(raw) && raw[7] > 30
}
// MachO matches Mach-O binaries format.
func MachO(raw []byte, limit uint32) bool {
if classOrMachOFat(raw) && raw[7] < 0x14 {
return true
}
if len(raw) < 4 {
return false
}
be := binary.BigEndian.Uint32(raw)
le := binary.LittleEndian.Uint32(raw)
return be == macho.Magic32 ||
le == macho.Magic32 ||
be == macho.Magic64 ||
le == macho.Magic64
}
// Dbf matches a dBase file.
// https://www.dbase.com/Knowledgebase/INT/db7_file_fmt.htm
func Dbf(raw []byte, limit uint32) bool {
if len(raw) < 68 {
return false
}
// 3rd and 4th bytes contain the last update month and day of month.
if raw[2] == 0 || raw[2] > 12 || raw[3] == 0 || raw[3] > 31 {
return false
}
// 12, 13, 30, 31 are reserved bytes and always filled with 0x00.
if raw[12] != 0x00 || raw[13] != 0x00 || raw[30] != 0x00 || raw[31] != 0x00 {
return false
}
// Production MDX flag;
// 0x01 if a production .MDX file exists for this table;
// 0x00 if no .MDX file exists.
if raw[28] > 0x01 {
return false
}
// dbf type is dictated by the first byte.
dbfTypes := []byte{
0x02, 0x03, 0x04, 0x05, 0x30, 0x31, 0x32, 0x42, 0x62, 0x7B, 0x82,
0x83, 0x87, 0x8A, 0x8B, 0x8E, 0xB3, 0xCB, 0xE5, 0xF5, 0xF4, 0xFB,
}
for _, b := range dbfTypes {
if raw[0] == b {
return true
}
}
return false
}
// ElfObj matches an object file.
func ElfObj(raw []byte, limit uint32) bool {
return len(raw) > 17 && ((raw[16] == 0x01 && raw[17] == 0x00) ||
(raw[16] == 0x00 && raw[17] == 0x01))
}
// ElfExe matches an executable file.
func ElfExe(raw []byte, limit uint32) bool {
return len(raw) > 17 && ((raw[16] == 0x02 && raw[17] == 0x00) ||
(raw[16] == 0x00 && raw[17] == 0x02))
}
// ElfLib matches a shared library file.
func ElfLib(raw []byte, limit uint32) bool {
return len(raw) > 17 && ((raw[16] == 0x03 && raw[17] == 0x00) ||
(raw[16] == 0x00 && raw[17] == 0x03))
}
// ElfDump matches a core dump file.
func ElfDump(raw []byte, limit uint32) bool {
return len(raw) > 17 && ((raw[16] == 0x04 && raw[17] == 0x00) ||
(raw[16] == 0x00 && raw[17] == 0x04))
}
// Dcm matches a DICOM medical format file.
func Dcm(raw []byte, limit uint32) bool {
return len(raw) > 131 &&
bytes.Equal(raw[128:132], []byte{0x44, 0x49, 0x43, 0x4D})
}
// Marc matches a MARC21 (MAchine-Readable Cataloging) file.
func Marc(raw []byte, limit uint32) bool {
// File is at least 24 bytes ("leader" field size).
if len(raw) < 24 {
return false
}
// Fixed bytes at offset 20.
if !bytes.Equal(raw[20:24], []byte("4500")) {
return false
}
// First 5 bytes are ASCII digits.
for i := 0; i < 5; i++ {
if raw[i] < '0' || raw[i] > '9' {
return false
}
}
// Field terminator is present in first 2048 bytes.
return bytes.Contains(raw[:min(2048, len(raw))], []byte{0x1E})
}
// GLB matches a glTF model format file.
// GLB is the binary file format representation of 3D models saved in
// the GL transmission Format (glTF).
// GLB uses little endian and its header structure is as follows:
//
// <-- 12-byte header -->
// | magic | version | length |
// | (uint32) | (uint32) | (uint32) |
// | \x67\x6C\x54\x46 | \x01\x00\x00\x00 | ... |
// | g l T F | 1 | ... |
//
// Visit [glTF specification] and [IANA glTF entry] for more details.
//
// [glTF specification]: https://registry.khronos.org/glTF/specs/2.0/glTF-2.0.html
// [IANA glTF entry]: https://www.iana.org/assignments/media-types/model/gltf-binary
var GLB = prefix([]byte("\x67\x6C\x54\x46\x02\x00\x00\x00"),
[]byte("\x67\x6C\x54\x46\x01\x00\x00\x00"))
// TzIf matches a Time Zone Information Format (TZif) file.
// See more: https://tools.ietf.org/id/draft-murchison-tzdist-tzif-00.html#rfc.section.3
// Its header structure is shown below:
//
// +---------------+---+
// | magic (4) | <-+-- version (1)
// +---------------+---+---------------------------------------+
// | [unused - reserved for future use] (15) |
// +---------------+---------------+---------------+-----------+
// | isutccnt (4) | isstdcnt (4) | leapcnt (4) |
// +---------------+---------------+---------------+
// | timecnt (4) | typecnt (4) | charcnt (4) |
func TzIf(raw []byte, limit uint32) bool {
// File is at least 44 bytes (header size).
if len(raw) < 44 {
return false
}
if !bytes.HasPrefix(raw, []byte("TZif")) {
return false
}
// Field "typecnt" MUST not be zero.
if binary.BigEndian.Uint32(raw[36:40]) == 0 {
return false
}
// Version has to be NUL (0x00), '2' (0x32) or '3' (0x33).
return raw[4] == 0x00 || raw[4] == 0x32 || raw[4] == 0x33
}

View File

@@ -0,0 +1,13 @@
package magic
var (
// Sqlite matches an SQLite database file.
Sqlite = prefix([]byte{
0x53, 0x51, 0x4c, 0x69, 0x74, 0x65, 0x20, 0x66,
0x6f, 0x72, 0x6d, 0x61, 0x74, 0x20, 0x33, 0x00,
})
// MsAccessAce matches a Microsoft Access database file.
MsAccessAce = offset([]byte("Standard ACE DB"), 4)
// MsAccessMdb matches legacy Microsoft Access database file (JET, 2003 and earlier).
MsAccessMdb = offset([]byte("Standard Jet DB"), 4)
)

View File

@@ -0,0 +1,83 @@
package magic
import (
"bytes"
"encoding/binary"
)
var (
// Fdf matches a Forms Data Format file.
Fdf = prefix([]byte("%FDF"))
// Mobi matches a Mobi file.
Mobi = offset([]byte("BOOKMOBI"), 60)
// Lit matches a Microsoft Lit file.
Lit = prefix([]byte("ITOLITLS"))
)
// PDF matches a Portable Document Format file.
// The %PDF- header should be the first thing inside the file but many
// implementations don't follow the rule. The PDF spec at Appendix H says the
// signature can be prepended by anything.
// https://bugs.astron.com/view.php?id=446
func PDF(raw []byte, _ uint32) bool {
raw = raw[:min(len(raw), 1024)]
return bytes.Contains(raw, []byte("%PDF-"))
}
// DjVu matches a DjVu file.
func DjVu(raw []byte, _ uint32) bool {
if len(raw) < 12 {
return false
}
if !bytes.HasPrefix(raw, []byte{0x41, 0x54, 0x26, 0x54, 0x46, 0x4F, 0x52, 0x4D}) {
return false
}
return bytes.HasPrefix(raw[12:], []byte("DJVM")) ||
bytes.HasPrefix(raw[12:], []byte("DJVU")) ||
bytes.HasPrefix(raw[12:], []byte("DJVI")) ||
bytes.HasPrefix(raw[12:], []byte("THUM"))
}
// P7s matches a .p7s signature file (PEM, Base64).
func P7s(raw []byte, _ uint32) bool {
// Check for PEM Encoding.
if bytes.HasPrefix(raw, []byte("-----BEGIN PKCS7")) {
return true
}
// Check if DER Encoding is long enough.
if len(raw) < 20 {
return false
}
// Magic Bytes for the signedData ASN.1 encoding.
startHeader := [][]byte{{0x30, 0x80}, {0x30, 0x81}, {0x30, 0x82}, {0x30, 0x83}, {0x30, 0x84}}
signedDataMatch := []byte{0x06, 0x09, 0x2A, 0x86, 0x48, 0x86, 0xF7, 0x0D, 0x01, 0x07}
// Check if Header is correct. There are multiple valid headers.
for i, match := range startHeader {
// If first bytes match, then check for ASN.1 Object Type.
if bytes.HasPrefix(raw, match) {
if bytes.HasPrefix(raw[i+2:], signedDataMatch) {
return true
}
}
}
return false
}
// Lotus123 matches a Lotus 1-2-3 spreadsheet document.
func Lotus123(raw []byte, _ uint32) bool {
if len(raw) <= 20 {
return false
}
version := binary.BigEndian.Uint32(raw)
if version == 0x00000200 {
return raw[6] != 0 && raw[7] == 0
}
return version == 0x00001a00 && raw[20] > 0 && raw[20] < 32
}
// CHM matches a Microsoft Compiled HTML Help file.
func CHM(raw []byte, _ uint32) bool {
return bytes.HasPrefix(raw, []byte("ITSF\003\000\000\000\x60\000\000\000"))
}

View File

@@ -0,0 +1,39 @@
package magic
import (
"bytes"
)
var (
// Woff matches a Web Open Font Format file.
Woff = prefix([]byte("wOFF"))
// Woff2 matches a Web Open Font Format version 2 file.
Woff2 = prefix([]byte("wOF2"))
// Otf matches an OpenType font file.
Otf = prefix([]byte{0x4F, 0x54, 0x54, 0x4F, 0x00})
)
// Ttf matches a TrueType font file.
func Ttf(raw []byte, limit uint32) bool {
if !bytes.HasPrefix(raw, []byte{0x00, 0x01, 0x00, 0x00}) {
return false
}
return !MsAccessAce(raw, limit) && !MsAccessMdb(raw, limit)
}
// Eot matches an Embedded OpenType font file.
func Eot(raw []byte, limit uint32) bool {
return len(raw) > 35 &&
bytes.Equal(raw[34:36], []byte{0x4C, 0x50}) &&
(bytes.Equal(raw[8:11], []byte{0x02, 0x00, 0x01}) ||
bytes.Equal(raw[8:11], []byte{0x01, 0x00, 0x00}) ||
bytes.Equal(raw[8:11], []byte{0x02, 0x00, 0x02}))
}
// Ttc matches a TrueType Collection font file.
func Ttc(raw []byte, limit uint32) bool {
return len(raw) > 7 &&
bytes.HasPrefix(raw, []byte("ttcf")) &&
(bytes.Equal(raw[4:8], []byte{0x00, 0x01, 0x00, 0x00}) ||
bytes.Equal(raw[4:8], []byte{0x00, 0x02, 0x00, 0x00}))
}

View File

@@ -0,0 +1,109 @@
package magic
import (
"bytes"
)
var (
// AVIF matches an AV1 Image File Format still or animated.
// Wikipedia page seems outdated listing image/avif-sequence for animations.
// https://github.com/AOMediaCodec/av1-avif/issues/59
AVIF = ftyp([]byte("avif"), []byte("avis"))
// ThreeGP matches a 3GPP file.
ThreeGP = ftyp(
[]byte("3gp1"), []byte("3gp2"), []byte("3gp3"), []byte("3gp4"),
[]byte("3gp5"), []byte("3gp6"), []byte("3gp7"), []byte("3gs7"),
[]byte("3ge6"), []byte("3ge7"), []byte("3gg6"),
)
// ThreeG2 matches a 3GPP2 file.
ThreeG2 = ftyp(
[]byte("3g24"), []byte("3g25"), []byte("3g26"), []byte("3g2a"),
[]byte("3g2b"), []byte("3g2c"), []byte("KDDI"),
)
// AMp4 matches an audio MP4 file.
AMp4 = ftyp(
// audio for Adobe Flash Player 9+
[]byte("F4A "), []byte("F4B "),
// Apple iTunes AAC-LC (.M4A) Audio
[]byte("M4B "), []byte("M4P "),
// MPEG-4 (.MP4) for SonyPSP
[]byte("MSNV"),
// Nero Digital AAC Audio
[]byte("NDAS"),
)
// Mqv matches a Sony / Mobile QuickTime file.
Mqv = ftyp([]byte("mqt "))
// M4a matches an audio M4A file.
M4a = ftyp([]byte("M4A "))
// M4v matches an Apple M4V video file.
M4v = ftyp([]byte("M4V "), []byte("M4VH"), []byte("M4VP"))
// Heic matches a High Efficiency Image Coding (HEIC) file.
Heic = ftyp([]byte("heic"), []byte("heix"))
// HeicSequence matches a High Efficiency Image Coding (HEIC) file sequence.
HeicSequence = ftyp([]byte("hevc"), []byte("hevx"))
// Heif matches a High Efficiency Image File Format (HEIF) file.
Heif = ftyp([]byte("mif1"), []byte("heim"), []byte("heis"), []byte("avic"))
// HeifSequence matches a High Efficiency Image File Format (HEIF) file sequence.
HeifSequence = ftyp([]byte("msf1"), []byte("hevm"), []byte("hevs"), []byte("avcs"))
// Mj2 matches a Motion JPEG 2000 file: https://en.wikipedia.org/wiki/Motion_JPEG_2000.
Mj2 = ftyp([]byte("mj2s"), []byte("mjp2"), []byte("MFSM"), []byte("MGSV"))
// Dvb matches a Digital Video Broadcasting file: https://dvb.org.
// https://cconcolato.github.io/mp4ra/filetype.html
// https://github.com/file/file/blob/512840337ead1076519332d24fefcaa8fac36e06/magic/Magdir/animation#L135-L154
Dvb = ftyp(
[]byte("dby1"), []byte("dsms"), []byte("dts1"), []byte("dts2"),
[]byte("dts3"), []byte("dxo "), []byte("dmb1"), []byte("dmpf"),
[]byte("drc1"), []byte("dv1a"), []byte("dv1b"), []byte("dv2a"),
[]byte("dv2b"), []byte("dv3a"), []byte("dv3b"), []byte("dvr1"),
[]byte("dvt1"), []byte("emsg"))
// TODO: add support for remaining video formats at ftyps.com.
)
// QuickTime matches a QuickTime File Format file.
// https://www.loc.gov/preservation/digital/formats/fdd/fdd000052.shtml
// https://developer.apple.com/library/archive/documentation/QuickTime/QTFF/QTFFChap1/qtff1.html#//apple_ref/doc/uid/TP40000939-CH203-38190
// https://github.com/apache/tika/blob/0f5570691133c75ac4472c3340354a6c4080b104/tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml#L7758-L7777
func QuickTime(raw []byte, _ uint32) bool {
if len(raw) < 12 {
return false
}
// First 4 bytes represent the size of the atom as unsigned int.
// Next 4 bytes are the type of the atom.
// For `ftyp` atoms check if first byte in size is 0, otherwise, a text file
// which happens to contain 'ftypqt ' at index 4 will trigger a false positive.
if bytes.Equal(raw[4:12], []byte("ftypqt ")) ||
bytes.Equal(raw[4:12], []byte("ftypmoov")) {
return raw[0] == 0x00
}
basicAtomTypes := [][]byte{
[]byte("moov\x00"),
[]byte("mdat\x00"),
[]byte("free\x00"),
[]byte("skip\x00"),
[]byte("pnot\x00"),
}
for _, a := range basicAtomTypes {
if bytes.Equal(raw[4:9], a) {
return true
}
}
return bytes.Equal(raw[:8], []byte("\x00\x00\x00\x08wide"))
}
// Mp4 detects an .mp4 file. Mp4 detection only does a basic ftyp check.
// MP4 has many registered and unregistered code points, so it is hard to keep
// track of them all. Detection defaults to video/mp4 for all ftyp files.
// ISO/IEC 14496-12 is the specification for the ISO container.
func Mp4(raw []byte, _ uint32) bool {
if len(raw) < 12 {
return false
}
// ftyps are made out of boxes. The first 4 bytes of the box represent
// its size in big-endian uint32. First box is the ftyp box and it is small
// in size. Check most significant byte is 0 to filter out false positive
// text files that happen to contain the string "ftyp" at index 4.
if raw[0] != 0 {
return false
}
return bytes.Equal(raw[4:8], []byte("ftyp"))
}

View File

@@ -0,0 +1,55 @@
package magic
import (
"bytes"
"encoding/binary"
)
// Shp matches a shape format file.
// https://www.esri.com/library/whitepapers/pdfs/shapefile.pdf
func Shp(raw []byte, limit uint32) bool {
if len(raw) < 112 {
return false
}
if binary.BigEndian.Uint32(raw[0:4]) != 9994 ||
binary.BigEndian.Uint32(raw[4:8]) != 0 ||
binary.BigEndian.Uint32(raw[8:12]) != 0 ||
binary.BigEndian.Uint32(raw[12:16]) != 0 ||
binary.BigEndian.Uint32(raw[16:20]) != 0 ||
binary.BigEndian.Uint32(raw[20:24]) != 0 ||
binary.LittleEndian.Uint32(raw[28:32]) != 1000 {
return false
}
shapeTypes := []int{
0, // Null shape
1, // Point
3, // Polyline
5, // Polygon
8, // MultiPoint
11, // PointZ
13, // PolylineZ
15, // PolygonZ
18, // MultiPointZ
21, // PointM
23, // PolylineM
25, // PolygonM
28, // MultiPointM
31, // MultiPatch
}
for _, st := range shapeTypes {
if st == int(binary.LittleEndian.Uint32(raw[108:112])) {
return true
}
}
return false
}
// Shx matches a shape index format file.
// https://www.esri.com/library/whitepapers/pdfs/shapefile.pdf
func Shx(raw []byte, limit uint32) bool {
return bytes.HasPrefix(raw, []byte{0x00, 0x00, 0x27, 0x0A})
}

View File

@@ -0,0 +1,110 @@
package magic
import "bytes"
var (
// Png matches a Portable Network Graphics file.
// https://www.w3.org/TR/PNG/
Png = prefix([]byte{0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A})
// Apng matches an Animated Portable Network Graphics file.
// https://wiki.mozilla.org/APNG_Specification
Apng = offset([]byte("acTL"), 37)
// Jpg matches a Joint Photographic Experts Group file.
Jpg = prefix([]byte{0xFF, 0xD8, 0xFF})
// Jp2 matches a JPEG 2000 Image file (ISO 15444-1).
Jp2 = jpeg2k([]byte{0x6a, 0x70, 0x32, 0x20})
// Jpx matches a JPEG 2000 Image file (ISO 15444-2).
Jpx = jpeg2k([]byte{0x6a, 0x70, 0x78, 0x20})
// Jpm matches a JPEG 2000 Image file (ISO 15444-6).
Jpm = jpeg2k([]byte{0x6a, 0x70, 0x6D, 0x20})
// Gif matches a Graphics Interchange Format file.
Gif = prefix([]byte("GIF87a"), []byte("GIF89a"))
// Bmp matches a bitmap image file.
Bmp = prefix([]byte{0x42, 0x4D})
// Ps matches a PostScript file.
Ps = prefix([]byte("%!PS-Adobe-"))
// Psd matches a Photoshop Document file.
Psd = prefix([]byte("8BPS"))
// Ico matches an ICO file.
Ico = prefix([]byte{0x00, 0x00, 0x01, 0x00}, []byte{0x00, 0x00, 0x02, 0x00})
// Icns matches an ICNS (Apple Icon Image format) file.
Icns = prefix([]byte("icns"))
// Tiff matches a Tagged Image File Format file.
Tiff = prefix([]byte{0x49, 0x49, 0x2A, 0x00}, []byte{0x4D, 0x4D, 0x00, 0x2A})
// Bpg matches a Better Portable Graphics file.
Bpg = prefix([]byte{0x42, 0x50, 0x47, 0xFB})
// Xcf matches GIMP image data.
Xcf = prefix([]byte("gimp xcf"))
// Pat matches GIMP pattern data.
Pat = offset([]byte("GPAT"), 20)
// Gbr matches GIMP brush data.
Gbr = offset([]byte("GIMP"), 20)
// Hdr matches Radiance HDR image.
// https://web.archive.org/web/20060913152809/http://local.wasp.uwa.edu.au/~pbourke/dataformats/pic/
Hdr = prefix([]byte("#?RADIANCE\n"))
// Xpm matches X PixMap image data.
Xpm = prefix([]byte{0x2F, 0x2A, 0x20, 0x58, 0x50, 0x4D, 0x20, 0x2A, 0x2F})
// Jxs matches a JPEG XS coded image file (ISO/IEC 21122-3).
Jxs = prefix([]byte{0x00, 0x00, 0x00, 0x0C, 0x4A, 0x58, 0x53, 0x20, 0x0D, 0x0A, 0x87, 0x0A})
// Jxr matches Microsoft HD JXR photo file.
Jxr = prefix([]byte{0x49, 0x49, 0xBC, 0x01})
)
func jpeg2k(sig []byte) Detector {
return func(raw []byte, _ uint32) bool {
if len(raw) < 24 {
return false
}
if !bytes.Equal(raw[4:8], []byte{0x6A, 0x50, 0x20, 0x20}) &&
!bytes.Equal(raw[4:8], []byte{0x6A, 0x50, 0x32, 0x20}) {
return false
}
return bytes.Equal(raw[20:24], sig)
}
}
// Webp matches a WebP file.
func Webp(raw []byte, _ uint32) bool {
return len(raw) > 12 &&
bytes.Equal(raw[0:4], []byte("RIFF")) &&
bytes.Equal(raw[8:12], []byte{0x57, 0x45, 0x42, 0x50})
}
// Dwg matches a CAD drawing file.
func Dwg(raw []byte, _ uint32) bool {
if len(raw) < 6 || raw[0] != 0x41 || raw[1] != 0x43 {
return false
}
dwgVersions := [][]byte{
{0x31, 0x2E, 0x34, 0x30},
{0x31, 0x2E, 0x35, 0x30},
{0x32, 0x2E, 0x31, 0x30},
{0x31, 0x30, 0x30, 0x32},
{0x31, 0x30, 0x30, 0x33},
{0x31, 0x30, 0x30, 0x34},
{0x31, 0x30, 0x30, 0x36},
{0x31, 0x30, 0x30, 0x39},
{0x31, 0x30, 0x31, 0x32},
{0x31, 0x30, 0x31, 0x34},
{0x31, 0x30, 0x31, 0x35},
{0x31, 0x30, 0x31, 0x38},
{0x31, 0x30, 0x32, 0x31},
{0x31, 0x30, 0x32, 0x34},
{0x31, 0x30, 0x33, 0x32},
}
for _, d := range dwgVersions {
if bytes.Equal(raw[2:6], d) {
return true
}
}
return false
}
// Jxl matches JPEG XL image file.
func Jxl(raw []byte, _ uint32) bool {
return bytes.HasPrefix(raw, []byte{0xFF, 0x0A}) ||
bytes.HasPrefix(raw, []byte("\x00\x00\x00\x0cJXL\x20\x0d\x0a\x87\x0a"))
}

View File

@@ -0,0 +1,212 @@
// Package magic holds the matching functions used to find MIME types.
package magic
import (
"bytes"
"fmt"
"github.com/gabriel-vasile/mimetype/internal/scan"
)
type (
// Detector receives the raw data of a file and returns whether the data
// meets any conditions. The limit parameter is an upper limit to the number
// of bytes received and is used to tell if the byte slice represents the
// whole file or is just the header of a file: len(raw) < limit or len(raw)>limit.
Detector func(raw []byte, limit uint32) bool
xmlSig struct {
// the local name of the root tag
localName []byte
// the namespace of the XML document
xmlns []byte
}
)
// prefix creates a Detector which returns true if any of the provided signatures
// is the prefix of the raw input.
func prefix(sigs ...[]byte) Detector {
return func(raw []byte, limit uint32) bool {
for _, s := range sigs {
if bytes.HasPrefix(raw, s) {
return true
}
}
return false
}
}
// offset creates a Detector which returns true if the provided signature can be
// found at offset in the raw input.
func offset(sig []byte, offset int) Detector {
return func(raw []byte, limit uint32) bool {
return len(raw) > offset && bytes.HasPrefix(raw[offset:], sig)
}
}
// ciPrefix is like prefix but the check is case insensitive.
func ciPrefix(sigs ...[]byte) Detector {
return func(raw []byte, limit uint32) bool {
for _, s := range sigs {
if ciCheck(s, raw) {
return true
}
}
return false
}
}
func ciCheck(sig, raw []byte) bool {
if len(raw) < len(sig)+1 {
return false
}
// perform case insensitive check
for i, b := range sig {
db := raw[i]
if 'A' <= b && b <= 'Z' {
db &= 0xDF
}
if b != db {
return false
}
}
return true
}
// xml creates a Detector which returns true if any of the provided XML signatures
// matches the raw input.
func xml(sigs ...xmlSig) Detector {
return func(raw []byte, limit uint32) bool {
b := scan.Bytes(raw)
b.TrimLWS()
if len(b) == 0 {
return false
}
for _, s := range sigs {
if xmlCheck(s, b) {
return true
}
}
return false
}
}
func xmlCheck(sig xmlSig, raw []byte) bool {
raw = raw[:min(len(raw), 512)]
if len(sig.localName) == 0 {
return bytes.Index(raw, sig.xmlns) > 0
}
if len(sig.xmlns) == 0 {
return bytes.Index(raw, sig.localName) > 0
}
localNameIndex := bytes.Index(raw, sig.localName)
return localNameIndex != -1 && localNameIndex < bytes.Index(raw, sig.xmlns)
}
// markup creates a Detector which returns true if any of the HTML signatures
// matches the raw input.
func markup(sigs ...[]byte) Detector {
return func(raw []byte, limit uint32) bool {
b := scan.Bytes(raw)
if bytes.HasPrefix(b, []byte{0xEF, 0xBB, 0xBF}) {
// We skip the UTF-8 BOM if present to ensure we correctly
// process any leading whitespace. The presence of the BOM
// is taken into account during charset detection in charset.go.
b.Advance(3)
}
b.TrimLWS()
if len(b) == 0 {
return false
}
for _, s := range sigs {
if markupCheck(s, b) {
return true
}
}
return false
}
}
func markupCheck(sig, raw []byte) bool {
if len(raw) < len(sig)+1 {
return false
}
// perform case insensitive check
for i, b := range sig {
db := raw[i]
if 'A' <= b && b <= 'Z' {
db &= 0xDF
}
if b != db {
return false
}
}
// Next byte must be space or right angle bracket.
if db := raw[len(sig)]; !scan.ByteIsWS(db) && db != '>' {
return false
}
return true
}
// ftyp creates a Detector which returns true if any of the FTYP signatures
// matches the raw input.
func ftyp(sigs ...[]byte) Detector {
return func(raw []byte, limit uint32) bool {
if len(raw) < 12 {
return false
}
for _, s := range sigs {
if bytes.Equal(raw[8:12], s) {
return true
}
}
return false
}
}
func newXMLSig(localName, xmlns string) xmlSig {
ret := xmlSig{xmlns: []byte(xmlns)}
if localName != "" {
ret.localName = []byte(fmt.Sprintf("<%s", localName))
}
return ret
}
// A valid shebang starts with the "#!" characters,
// followed by any number of spaces,
// followed by the path to the interpreter,
// and, optionally, followed by the arguments for the interpreter.
//
// Ex:
//
// #! /usr/bin/env php
//
// /usr/bin/env is the interpreter, php is the first and only argument.
func shebang(sigs ...[]byte) Detector {
return func(raw []byte, limit uint32) bool {
b := scan.Bytes(raw)
line := b.Line()
for _, s := range sigs {
if shebangCheck(s, line) {
return true
}
}
return false
}
}
func shebangCheck(sig []byte, raw scan.Bytes) bool {
if len(raw) < len(sig)+2 {
return false
}
if raw[0] != '#' || raw[1] != '!' {
return false
}
raw.Advance(2) // skip #! we checked above
raw.TrimLWS()
raw.TrimRWS()
return bytes.Equal(raw, sig)
}
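
A standalone sketch of the same shebang rule, using only the standard library (a hypothetical helper, not the package API):

```go
package main

import (
	"bytes"
	"fmt"
)

// hasShebang reports whether line starts with "#!" and, after trimming
// surrounding whitespace, names exactly the expected interpreter invocation.
func hasShebang(line, interpreter []byte) bool {
	if !bytes.HasPrefix(line, []byte("#!")) {
		return false
	}
	return bytes.Equal(bytes.TrimSpace(line[2:]), interpreter)
}

func main() {
	fmt.Println(hasShebang([]byte("#! /usr/bin/env php"), []byte("/usr/bin/env php"))) // true
	fmt.Println(hasShebang([]byte("#!/bin/sh"), []byte("/usr/bin/env php")))           // false
}
```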

View File

@@ -0,0 +1,211 @@
package magic
import (
"bytes"
"encoding/binary"
)
// Xlsx matches a Microsoft Excel 2007 file.
func Xlsx(raw []byte, limit uint32) bool {
return msoxml(raw, zipEntries{{
name: []byte("xl/"),
dir: true,
}}, 100)
}
// Docx matches a Microsoft Word 2007 file.
func Docx(raw []byte, limit uint32) bool {
return msoxml(raw, zipEntries{{
name: []byte("word/"),
dir: true,
}}, 100)
}
// Pptx matches a Microsoft PowerPoint 2007 file.
func Pptx(raw []byte, limit uint32) bool {
return msoxml(raw, zipEntries{{
name: []byte("ppt/"),
dir: true,
}}, 100)
}
// Visio matches a Microsoft Visio 2013+ file.
func Visio(raw []byte, limit uint32) bool {
return msoxml(raw, zipEntries{{
name: []byte("visio/"),
dir: true,
}}, 100)
}
// Ole matches an Object Linking and Embedding (OLE) file.
//
// https://en.wikipedia.org/wiki/Object_Linking_and_Embedding
func Ole(raw []byte, limit uint32) bool {
return bytes.HasPrefix(raw, []byte{0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1})
}
// Aaf matches an Advanced Authoring Format file.
// See: https://pyaaf.readthedocs.io/en/latest/about.html
// See: https://en.wikipedia.org/wiki/Advanced_Authoring_Format
func Aaf(raw []byte, limit uint32) bool {
if len(raw) < 31 {
return false
}
return bytes.HasPrefix(raw[8:], []byte{0x41, 0x41, 0x46, 0x42, 0x0D, 0x00, 0x4F, 0x4D}) &&
(raw[30] == 0x09 || raw[30] == 0x0C)
}
// Doc matches a Microsoft Word 97-2003 file.
// See: https://github.com/decalage2/oletools/blob/412ee36ae45e70f42123e835871bac956d958461/oletools/common/clsid.py
func Doc(raw []byte, _ uint32) bool {
clsids := [][]byte{
// Microsoft Word 97-2003 Document (Word.Document.8)
{0x06, 0x09, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0xc0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x46},
// Microsoft Word 6.0-7.0 Document (Word.Document.6)
{0x00, 0x09, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0xc0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x46},
// Microsoft Word Picture (Word.Picture.8)
{0x07, 0x09, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0xc0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x46},
}
for _, clsid := range clsids {
if matchOleClsid(raw, clsid) {
return true
}
}
return false
}
// Ppt matches a Microsoft PowerPoint 97-2003 file or a PowerPoint 95 presentation.
func Ppt(raw []byte, limit uint32) bool {
// The root CLSID test is the safest way to identify OLE; however, the format
// often places the root CLSID at the end of the file.
if matchOleClsid(raw, []byte{
0x10, 0x8d, 0x81, 0x64, 0x9b, 0x4f, 0xcf, 0x11,
0x86, 0xea, 0x00, 0xaa, 0x00, 0xb9, 0x29, 0xe8,
}) || matchOleClsid(raw, []byte{
0x70, 0xae, 0x7b, 0xea, 0x3b, 0xfb, 0xcd, 0x11,
0xa9, 0x03, 0x00, 0xaa, 0x00, 0x51, 0x0e, 0xa3,
}) {
return true
}
lin := len(raw)
if lin < 520 {
return false
}
pptSubHeaders := [][]byte{
{0xA0, 0x46, 0x1D, 0xF0},
{0x00, 0x6E, 0x1E, 0xF0},
{0x0F, 0x00, 0xE8, 0x03},
}
for _, h := range pptSubHeaders {
if bytes.HasPrefix(raw[512:], h) {
return true
}
}
if bytes.HasPrefix(raw[512:], []byte{0xFD, 0xFF, 0xFF, 0xFF}) &&
raw[518] == 0x00 && raw[519] == 0x00 {
return true
}
return lin > 1152 && bytes.Contains(raw[1152:min(4096, lin)],
[]byte("P\x00o\x00w\x00e\x00r\x00P\x00o\x00i\x00n\x00t\x00 D\x00o\x00c\x00u\x00m\x00e\x00n\x00t"))
}
// Xls matches a Microsoft Excel 97-2003 file.
func Xls(raw []byte, limit uint32) bool {
// The root CLSID test is the safest way to identify OLE; however, the format
// often places the root CLSID at the end of the file.
if matchOleClsid(raw, []byte{
0x10, 0x08, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00,
}) || matchOleClsid(raw, []byte{
0x20, 0x08, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00,
}) {
return true
}
lin := len(raw)
if lin < 520 {
return false
}
xlsSubHeaders := [][]byte{
{0x09, 0x08, 0x10, 0x00, 0x00, 0x06, 0x05, 0x00},
{0xFD, 0xFF, 0xFF, 0xFF, 0x10},
{0xFD, 0xFF, 0xFF, 0xFF, 0x1F},
{0xFD, 0xFF, 0xFF, 0xFF, 0x22},
{0xFD, 0xFF, 0xFF, 0xFF, 0x23},
{0xFD, 0xFF, 0xFF, 0xFF, 0x28},
{0xFD, 0xFF, 0xFF, 0xFF, 0x29},
}
for _, h := range xlsSubHeaders {
if bytes.HasPrefix(raw[512:], h) {
return true
}
}
return lin > 1152 && bytes.Contains(raw[1152:min(4096, lin)],
[]byte("W\x00k\x00s\x00S\x00S\x00W\x00o\x00r\x00k\x00B\x00o\x00o\x00k"))
}
// Pub matches a Microsoft Publisher file.
func Pub(raw []byte, limit uint32) bool {
return matchOleClsid(raw, []byte{
0x01, 0x12, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0xC0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x46,
})
}
// Msg matches a Microsoft Outlook email file.
func Msg(raw []byte, limit uint32) bool {
return matchOleClsid(raw, []byte{
0x0B, 0x0D, 0x02, 0x00, 0x00, 0x00, 0x00, 0x00,
0xC0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x46,
})
}
// Msi matches a Microsoft Windows Installer file.
// http://fileformats.archiveteam.org/wiki/Microsoft_Compound_File
func Msi(raw []byte, limit uint32) bool {
return matchOleClsid(raw, []byte{
0x84, 0x10, 0x0C, 0x00, 0x00, 0x00, 0x00, 0x00,
0xC0, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x46,
})
}
// One matches a Microsoft OneNote file.
func One(raw []byte, limit uint32) bool {
return bytes.HasPrefix(raw, []byte{
0xe4, 0x52, 0x5c, 0x7b, 0x8c, 0xd8, 0xa7, 0x4d,
0xae, 0xb1, 0x53, 0x78, 0xd0, 0x29, 0x96, 0xd3,
})
}
// Helper to match by a specific CLSID of a compound file.
//
// http://fileformats.archiveteam.org/wiki/Microsoft_Compound_File
func matchOleClsid(in []byte, clsid []byte) bool {
// Microsoft Compound files v3 have a sector length of 512, while v4 has 4096.
// Change sector offset depending on file version.
// https://www.loc.gov/preservation/digital/formats/fdd/fdd000392.shtml
sectorLength := 512
if len(in) < sectorLength {
return false
}
if in[26] == 0x04 && in[27] == 0x00 {
sectorLength = 4096
}
// SecID of first sector of the directory stream.
firstSecID := int(binary.LittleEndian.Uint32(in[48:52]))
// Expected offset of CLSID for root storage object.
clsidOffset := sectorLength*(1+firstSecID) + 80
if len(in) <= clsidOffset+16 {
return false
}
return bytes.HasPrefix(in[clsidOffset:], clsid)
}
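
To make the offset arithmetic concrete, here is a small worked example (an illustrative sketch, not part of the vendored code): for a v3 compound file with 512-byte sectors whose directory stream starts at SecID 1, the root storage object's CLSID is expected 80 bytes into the second sector, i.e. at byte 512*(1+1) + 80 = 1104.

```go
package main

import "fmt"

func main() {
	// Worked example of the offset computed by matchOleClsid for a
	// Compound File v3 header (512-byte sectors).
	sectorLength := 512
	firstSecID := 1 // example value; real files store it in header bytes 48..51
	clsidOffset := sectorLength*(1+firstSecID) + 80
	fmt.Println(clsidOffset) // 1104
}
```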

View File

@@ -0,0 +1,111 @@
package magic
import (
"bytes"
"strconv"
"github.com/gabriel-vasile/mimetype/internal/scan"
)
// NetPBM matches a Netpbm Portable BitMap ASCII/Binary file.
//
// See: https://en.wikipedia.org/wiki/Netpbm
func NetPBM(raw []byte, _ uint32) bool {
return netp(raw, "P1\n", "P4\n")
}
// NetPGM matches a Netpbm Portable GrayMap ASCII/Binary file.
//
// See: https://en.wikipedia.org/wiki/Netpbm
func NetPGM(raw []byte, _ uint32) bool {
return netp(raw, "P2\n", "P5\n")
}
// NetPPM matches a Netpbm Portable PixMap ASCII/Binary file.
//
// See: https://en.wikipedia.org/wiki/Netpbm
func NetPPM(raw []byte, _ uint32) bool {
return netp(raw, "P3\n", "P6\n")
}
// NetPAM matches a Netpbm Portable Arbitrary Map file.
//
// See: https://en.wikipedia.org/wiki/Netpbm
func NetPAM(raw []byte, _ uint32) bool {
if !bytes.HasPrefix(raw, []byte("P7\n")) {
return false
}
w, h, d, m, e := false, false, false, false, false
s := scan.Bytes(raw)
var l scan.Bytes
// Read line by line.
for i := 0; i < 128; i++ {
l = s.Line()
// If the line is empty or a comment, skip.
if len(l) == 0 || l.Peek() == '#' {
if len(s) == 0 {
return false
}
continue
} else if bytes.HasPrefix(l, []byte("TUPLTYPE")) {
continue
} else if bytes.HasPrefix(l, []byte("WIDTH ")) {
w = true
} else if bytes.HasPrefix(l, []byte("HEIGHT ")) {
h = true
} else if bytes.HasPrefix(l, []byte("DEPTH ")) {
d = true
} else if bytes.HasPrefix(l, []byte("MAXVAL ")) {
m = true
} else if bytes.HasPrefix(l, []byte("ENDHDR")) {
e = true
}
// When we reach ENDHDR, return true if we collected all four required
// headers: WIDTH, HEIGHT, DEPTH and MAXVAL.
if e {
return w && h && d && m
}
}
return false
}
func netp(s scan.Bytes, prefixes ...string) bool {
foundPrefix := ""
for _, p := range prefixes {
if bytes.HasPrefix(s, []byte(p)) {
foundPrefix = p
}
}
if foundPrefix == "" {
return false
}
s.Advance(len(foundPrefix)) // jump over P1, P2, P3, etc.
var l scan.Bytes
// Read line by line.
for i := 0; i < 128; i++ {
l = s.Line()
// If the line is a comment, skip.
if l.Peek() == '#' {
continue
}
// If line has leading whitespace, then skip over whitespace.
for scan.ByteIsWS(l.Peek()) {
l.Advance(1)
}
if len(s) == 0 || len(l) > 0 {
break
}
}
// At this point l should hold the two integers denoting the image dimensions.
width := l.PopUntil(scan.ASCIISpaces...)
for scan.ByteIsWS(l.Peek()) {
l.Advance(1)
}
height := l.PopUntil(scan.ASCIISpaces...)
w, errw := strconv.ParseInt(string(width), 10, 64)
h, errh := strconv.ParseInt(string(height), 10, 64)
return errw == nil && errh == nil && w > 0 && h > 0
}
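
As a quick sanity check through the library's public API (a hedged sketch; the magic package itself is internal and not importable directly), a minimal ASCII PBM should resolve to image/x-portable-bitmap:

```go
package main

import (
	"fmt"

	"github.com/gabriel-vasile/mimetype"
)

func main() {
	// "P1" magic, a width/height line, then pixel data.
	pbm := []byte("P1\n2 2\n0 1\n1 0\n")
	fmt.Println(mimetype.Detect(pbm).String()) // image/x-portable-bitmap
}
```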

View File

@@ -0,0 +1,42 @@
package magic
import (
"bytes"
)
/*
NOTE:
In May 2003, two Internet RFCs were published relating to the format.
The Ogg bitstream was defined in RFC 3533 (which is classified as
'informative') and its Internet content type (application/ogg) in RFC
3534 (which is, as of 2006, a proposed standard protocol). In
September 2008, RFC 3534 was obsoleted by RFC 5334, which added
content types video/ogg, audio/ogg and filename extensions .ogx, .ogv,
.oga, .spx.
See:
https://tools.ietf.org/html/rfc3533
https://developer.mozilla.org/en-US/docs/Web/HTTP/Configuring_servers_for_Ogg_media#Serve_media_with_the_correct_MIME_type
https://github.com/file/file/blob/master/magic/Magdir/vorbis
*/
// Ogg matches an Ogg file.
func Ogg(raw []byte, limit uint32) bool {
return bytes.HasPrefix(raw, []byte("\x4F\x67\x67\x53\x00"))
}
// OggAudio matches an audio ogg file.
func OggAudio(raw []byte, limit uint32) bool {
return len(raw) >= 37 && (bytes.HasPrefix(raw[28:], []byte("\x7fFLAC")) ||
bytes.HasPrefix(raw[28:], []byte("\x01vorbis")) ||
bytes.HasPrefix(raw[28:], []byte("OpusHead")) ||
bytes.HasPrefix(raw[28:], []byte("Speex\x20\x20\x20")))
}
// OggVideo matches a video ogg file.
func OggVideo(raw []byte, limit uint32) bool {
return len(raw) >= 37 && (bytes.HasPrefix(raw[28:], []byte("\x80theora")) ||
bytes.HasPrefix(raw[28:], []byte("fishead\x00")) ||
bytes.HasPrefix(raw[28:], []byte("\x01video\x00\x00\x00"))) // OGM video
}

View File

@@ -0,0 +1,411 @@
package magic
import (
"bytes"
"time"
"github.com/gabriel-vasile/mimetype/internal/charset"
"github.com/gabriel-vasile/mimetype/internal/json"
mkup "github.com/gabriel-vasile/mimetype/internal/markup"
"github.com/gabriel-vasile/mimetype/internal/scan"
)
var (
// HTML matches a Hypertext Markup Language file.
HTML = markup(
[]byte("<!DOCTYPE HTML"),
[]byte("<HTML"),
[]byte("<HEAD"),
[]byte("<SCRIPT"),
[]byte("<IFRAME"),
[]byte("<H1"),
[]byte("<DIV"),
[]byte("<FONT"),
[]byte("<TABLE"),
[]byte("<A"),
[]byte("<STYLE"),
[]byte("<TITLE"),
[]byte("<B"),
[]byte("<BODY"),
[]byte("<BR"),
[]byte("<P"),
[]byte("<!--"),
)
// XML matches an Extensible Markup Language file.
XML = markup([]byte("<?XML"))
// Owl2 matches an Owl ontology file.
Owl2 = xml(newXMLSig("Ontology", `xmlns="http://www.w3.org/2002/07/owl#"`))
// Rss matches a Rich Site Summary file.
Rss = xml(newXMLSig("rss", ""))
// Atom matches an Atom Syndication Format file.
Atom = xml(newXMLSig("feed", `xmlns="http://www.w3.org/2005/Atom"`))
// Kml matches a Keyhole Markup Language file.
Kml = xml(
newXMLSig("kml", `xmlns="http://www.opengis.net/kml/2.2"`),
newXMLSig("kml", `xmlns="http://earth.google.com/kml/2.0"`),
newXMLSig("kml", `xmlns="http://earth.google.com/kml/2.1"`),
newXMLSig("kml", `xmlns="http://earth.google.com/kml/2.2"`),
)
// Xliff matches an XML Localization Interchange File Format file.
Xliff = xml(newXMLSig("xliff", `xmlns="urn:oasis:names:tc:xliff:document:1.2"`))
// Collada matches a COLLAborative Design Activity file.
Collada = xml(newXMLSig("COLLADA", `xmlns="http://www.collada.org/2005/11/COLLADASchema"`))
// Gml matches a Geography Markup Language file.
Gml = xml(
newXMLSig("", `xmlns:gml="http://www.opengis.net/gml"`),
newXMLSig("", `xmlns:gml="http://www.opengis.net/gml/3.2"`),
newXMLSig("", `xmlns:gml="http://www.opengis.net/gml/3.3/exr"`),
)
// Gpx matches a GPS Exchange Format file.
Gpx = xml(newXMLSig("gpx", `xmlns="http://www.topografix.com/GPX/1/1"`))
// Tcx matches a Training Center XML file.
Tcx = xml(newXMLSig("TrainingCenterDatabase", `xmlns="http://www.garmin.com/xmlschemas/TrainingCenterDatabase/v2"`))
// X3d matches an Extensible 3D Graphics file.
X3d = xml(newXMLSig("X3D", `xmlns:xsd="http://www.w3.org/2001/XMLSchema-instance"`))
// Amf matches an Additive Manufacturing XML file.
Amf = xml(newXMLSig("amf", ""))
// Threemf matches a 3D Manufacturing Format file.
Threemf = xml(newXMLSig("model", `xmlns="http://schemas.microsoft.com/3dmanufacturing/core/2015/02"`))
// Xfdf matches an XML Forms Data Format file.
Xfdf = xml(newXMLSig("xfdf", `xmlns="http://ns.adobe.com/xfdf/"`))
// VCard matches a Virtual Contact File.
VCard = ciPrefix([]byte("BEGIN:VCARD\n"), []byte("BEGIN:VCARD\r\n"))
// ICalendar matches an iCalendar file.
ICalendar = ciPrefix([]byte("BEGIN:VCALENDAR\n"), []byte("BEGIN:VCALENDAR\r\n"))
phpPageF = ciPrefix(
[]byte("<?PHP"),
[]byte("<?\n"),
[]byte("<?\r"),
[]byte("<? "),
)
phpScriptF = shebang(
[]byte("/usr/local/bin/php"),
[]byte("/usr/bin/php"),
[]byte("/usr/bin/env php"),
)
// Js matches a Javascript file.
Js = shebang(
[]byte("/bin/node"),
[]byte("/usr/bin/node"),
[]byte("/bin/nodejs"),
[]byte("/usr/bin/nodejs"),
[]byte("/usr/bin/env node"),
[]byte("/usr/bin/env nodejs"),
)
// Lua matches a Lua programming language file.
Lua = shebang(
[]byte("/usr/bin/lua"),
[]byte("/usr/local/bin/lua"),
[]byte("/usr/bin/env lua"),
)
// Perl matches a Perl programming language file.
Perl = shebang(
[]byte("/usr/bin/perl"),
[]byte("/usr/bin/env perl"),
)
// Python matches a Python programming language file.
Python = shebang(
[]byte("/usr/bin/python"),
[]byte("/usr/local/bin/python"),
[]byte("/usr/bin/env python"),
[]byte("/usr/bin/python2"),
[]byte("/usr/local/bin/python2"),
[]byte("/usr/bin/env python2"),
[]byte("/usr/bin/python3"),
[]byte("/usr/local/bin/python3"),
[]byte("/usr/bin/env python3"),
)
// Ruby matches a Ruby programming language file.
Ruby = shebang(
[]byte("/usr/bin/ruby"),
[]byte("/usr/local/bin/ruby"),
[]byte("/usr/bin/env ruby"),
)
// Tcl matches a Tcl programming language file.
Tcl = shebang(
[]byte("/usr/bin/tcl"),
[]byte("/usr/local/bin/tcl"),
[]byte("/usr/bin/env tcl"),
[]byte("/usr/bin/tclsh"),
[]byte("/usr/local/bin/tclsh"),
[]byte("/usr/bin/env tclsh"),
[]byte("/usr/bin/wish"),
[]byte("/usr/local/bin/wish"),
[]byte("/usr/bin/env wish"),
)
// Rtf matches a Rich Text Format file.
Rtf = prefix([]byte("{\\rtf"))
// Shell matches a shell script file.
Shell = shebang(
[]byte("/bin/sh"),
[]byte("/bin/bash"),
[]byte("/usr/local/bin/bash"),
[]byte("/usr/bin/env bash"),
[]byte("/bin/csh"),
[]byte("/usr/local/bin/csh"),
[]byte("/usr/bin/env csh"),
[]byte("/bin/dash"),
[]byte("/usr/local/bin/dash"),
[]byte("/usr/bin/env dash"),
[]byte("/bin/ksh"),
[]byte("/usr/local/bin/ksh"),
[]byte("/usr/bin/env ksh"),
[]byte("/bin/tcsh"),
[]byte("/usr/local/bin/tcsh"),
[]byte("/usr/bin/env tcsh"),
[]byte("/bin/zsh"),
[]byte("/usr/local/bin/zsh"),
[]byte("/usr/bin/env zsh"),
)
)
// Text matches a plain text file.
//
// TODO: This function does not parse BOM-less UTF16 and UTF32 files. Not really
// sure it should. Linux file utility also requires a BOM for UTF16 and UTF32.
func Text(raw []byte, _ uint32) bool {
// First look for BOM.
if cset := charset.FromBOM(raw); cset != "" {
return true
}
// Binary data bytes as defined here: https://mimesniff.spec.whatwg.org/#binary-data-byte
for i := 0; i < min(len(raw), 4096); i++ {
b := raw[i]
if b <= 0x08 ||
b == 0x0B ||
0x0E <= b && b <= 0x1A ||
0x1C <= b && b <= 0x1F {
return false
}
}
return true
}
// XHTML matches an XHTML file. This check depends on the XML check to have passed.
func XHTML(raw []byte, limit uint32) bool {
raw = raw[:min(len(raw), 4096)]
b := scan.Bytes(raw)
return b.Search([]byte("<!DOCTYPE HTML"), scan.CompactWS|scan.IgnoreCase) != -1 ||
b.Search([]byte("<HTML XMLNS="), scan.CompactWS|scan.IgnoreCase) != -1
}
// Php matches a PHP: Hypertext Preprocessor file.
func Php(raw []byte, limit uint32) bool {
if res := phpPageF(raw, limit); res {
return res
}
return phpScriptF(raw, limit)
}
// JSON matches a JavaScript Object Notation file.
func JSON(raw []byte, limit uint32) bool {
// #175 A single JSON string, number or bool is not considered JSON.
// JSON objects and arrays are reported as JSON.
return jsonHelper(raw, limit, json.QueryNone, json.TokObject|json.TokArray)
}
// GeoJSON matches a RFC 7946 GeoJSON file.
//
// GeoJSON detection implies searching for key:value pairs like: `"type": "Feature"`
// in the input.
func GeoJSON(raw []byte, limit uint32) bool {
return jsonHelper(raw, limit, json.QueryGeo, json.TokObject)
}
// HAR matches a HAR Spec file.
// Spec: http://www.softwareishard.com/blog/har-12-spec/
func HAR(raw []byte, limit uint32) bool {
return jsonHelper(raw, limit, json.QueryHAR, json.TokObject)
}
// GLTF matches a GL Transmission Format (JSON) file.
// Visit [glTF specification] and [IANA glTF entry] for more details.
//
// [glTF specification]: https://registry.khronos.org/glTF/specs/2.0/glTF-2.0.html
// [IANA glTF entry]: https://www.iana.org/assignments/media-types/model/gltf+json
func GLTF(raw []byte, limit uint32) bool {
return jsonHelper(raw, limit, json.QueryGLTF, json.TokObject)
}
func jsonHelper(raw []byte, limit uint32, q string, wantTok int) bool {
if !json.LooksLikeObjectOrArray(raw) {
return false
}
lraw := len(raw)
parsed, inspected, firstToken, querySatisfied := json.Parse(q, raw)
if !querySatisfied || firstToken&wantTok == 0 {
return false
}
// If the full file content was provided, check that the whole input was parsed.
if limit == 0 || lraw < int(limit) {
return parsed == lraw
}
// If a section of the file was provided, check if all of it was inspected.
// In other words, check that if there was a problem parsing, that problem
// occurred at the last byte in the input.
return inspected == lraw && lraw > 0
}
// NdJSON matches a Newline delimited JSON file. All complete lines from raw
// must be valid JSON documents meaning they contain one of the valid JSON data
// types.
func NdJSON(raw []byte, limit uint32) bool {
lCount, objOrArr := 0, 0
s := scan.Bytes(raw)
s.DropLastLine(limit)
var l scan.Bytes
for len(s) != 0 {
l = s.Line()
_, inspected, firstToken, _ := json.Parse(json.QueryNone, l)
if len(l) != inspected {
return false
}
if firstToken == json.TokArray || firstToken == json.TokObject {
objOrArr++
}
lCount++
}
return lCount > 1 && objOrArr > 0
}
// Svg matches a SVG file.
func Svg(raw []byte, limit uint32) bool {
return svgWithoutXMLDeclaration(raw) || svgWithXMLDeclaration(raw)
}
// svgWithoutXMLDeclaration matches an SVG image that does not have an XML header.
// Example:
//
// <!-- xml comment ignored -->
// <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
// <rect fill="#fff" stroke="#000" x="-70" y="-70" width="390" height="390"/>
// </svg>
func svgWithoutXMLDeclaration(s scan.Bytes) bool {
for scan.ByteIsWS(s.Peek()) {
s.Advance(1)
}
for mkup.SkipAComment(&s) {
}
if !bytes.HasPrefix(s, []byte("<svg")) {
return false
}
targetName, targetVal := "xmlns", "http://www.w3.org/2000/svg"
aName, aVal, hasMore := "", "", true
for hasMore {
aName, aVal, hasMore = mkup.GetAnAttribute(&s)
if aName == targetName && aVal == targetVal {
return true
}
if !hasMore {
return false
}
}
return false
}
// svgWithXMLDeclaration matches an SVG image that has an XML header.
// Example:
//
// <?xml version="1.0" encoding="UTF-8" standalone="no"?>
// <svg width="391" height="391" viewBox="-70.5 -70.5 391 391" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
// <rect fill="#fff" stroke="#000" x="-70" y="-70" width="390" height="390"/>
// </svg>
func svgWithXMLDeclaration(s scan.Bytes) bool {
for scan.ByteIsWS(s.Peek()) {
s.Advance(1)
}
if !bytes.HasPrefix(s, []byte("<?xml")) {
return false
}
// version is a required attribute for XML.
hasVersion := false
aName, hasMore := "", true
for hasMore {
aName, _, hasMore = mkup.GetAnAttribute(&s)
if aName == "version" {
hasVersion = true
break
}
if !hasMore {
break
}
}
if len(s) > 4096 {
s = s[:4096]
}
return hasVersion && bytes.Contains(s, []byte("<svg"))
}
// Srt matches a SubRip file.
func Srt(raw []byte, _ uint32) bool {
s := scan.Bytes(raw)
line := s.Line()
// First line must be 1.
if len(line) != 1 || line[0] != '1' {
return false
}
line = s.Line()
// Timestamp format (e.g. 00:02:16,612 --> 00:02:19,376) limits second line
// length to exactly 29 characters.
if len(line) != 29 {
return false
}
// Decimal separator of fractional seconds in the timestamps must be a
// comma, not a period.
if bytes.IndexByte(line, '.') != -1 {
return false
}
sep := []byte(" --> ")
i := bytes.Index(line, sep)
if i == -1 {
return false
}
const layout = "15:04:05,000"
t0, err := time.Parse(layout, string(line[:i]))
if err != nil {
return false
}
t1, err := time.Parse(layout, string(line[i+len(sep):]))
if err != nil {
return false
}
if t0.After(t1) {
return false
}
line = s.Line()
// A third line must exist and not be empty. This is the actual subtitle text.
return len(line) != 0
}
// Vtt matches a Web Video Text Tracks (WebVTT) file. See
// https://www.iana.org/assignments/media-types/text/vtt.
func Vtt(raw []byte, limit uint32) bool {
// Prefix match.
prefixes := [][]byte{
{0xEF, 0xBB, 0xBF, 0x57, 0x45, 0x42, 0x56, 0x54, 0x54, 0x0A}, // UTF-8 BOM, "WEBVTT" and a line feed
{0xEF, 0xBB, 0xBF, 0x57, 0x45, 0x42, 0x56, 0x54, 0x54, 0x0D}, // UTF-8 BOM, "WEBVTT" and a carriage return
{0xEF, 0xBB, 0xBF, 0x57, 0x45, 0x42, 0x56, 0x54, 0x54, 0x20}, // UTF-8 BOM, "WEBVTT" and a space
{0xEF, 0xBB, 0xBF, 0x57, 0x45, 0x42, 0x56, 0x54, 0x54, 0x09}, // UTF-8 BOM, "WEBVTT" and a horizontal tab
{0x57, 0x45, 0x42, 0x56, 0x54, 0x54, 0x0A}, // "WEBVTT" and a line feed
{0x57, 0x45, 0x42, 0x56, 0x54, 0x54, 0x0D}, // "WEBVTT" and a carriage return
{0x57, 0x45, 0x42, 0x56, 0x54, 0x54, 0x20}, // "WEBVTT" and a space
{0x57, 0x45, 0x42, 0x56, 0x54, 0x54, 0x09}, // "WEBVTT" and a horizontal tab
}
for _, p := range prefixes {
if bytes.HasPrefix(raw, p) {
return true
}
}
// Exact match.
return bytes.Equal(raw, []byte{0xEF, 0xBB, 0xBF, 0x57, 0x45, 0x42, 0x56, 0x54, 0x54}) || // UTF-8 BOM and "WEBVTT"
bytes.Equal(raw, []byte{0x57, 0x45, 0x42, 0x56, 0x54, 0x54}) // "WEBVTT"
}

View File

@@ -0,0 +1,43 @@
package magic
import (
"github.com/gabriel-vasile/mimetype/internal/csv"
"github.com/gabriel-vasile/mimetype/internal/scan"
)
// CSV matches a comma-separated values file.
func CSV(raw []byte, limit uint32) bool {
return sv(raw, ',', limit)
}
// TSV matches a tab-separated values file.
func TSV(raw []byte, limit uint32) bool {
return sv(raw, '\t', limit)
}
func sv(in []byte, comma byte, limit uint32) bool {
s := scan.Bytes(in)
s.DropLastLine(limit)
r := csv.NewParser(comma, '#', s)
headerFields, _, hasMore := r.CountFields(false)
if headerFields < 2 || !hasMore {
return false
}
csvLines := 1 // 1 for header
for {
fields, _, hasMore := r.CountFields(false)
if !hasMore && fields == 0 {
break
}
csvLines++
if fields != headerFields {
return false
}
if csvLines >= 10 {
return true
}
}
return csvLines >= 2
}

View File

@@ -0,0 +1,85 @@
package magic
import (
"bytes"
)
var (
// Flv matches a Flash video file.
Flv = prefix([]byte("\x46\x4C\x56\x01"))
// Asf matches an Advanced Systems Format file.
Asf = prefix([]byte{
0x30, 0x26, 0xB2, 0x75, 0x8E, 0x66, 0xCF, 0x11,
0xA6, 0xD9, 0x00, 0xAA, 0x00, 0x62, 0xCE, 0x6C,
})
// Rmvb matches a RealMedia Variable Bitrate file.
Rmvb = prefix([]byte{0x2E, 0x52, 0x4D, 0x46})
)
// WebM matches a WebM file.
func WebM(raw []byte, limit uint32) bool {
return isMatroskaFileTypeMatched(raw, "webm")
}
// Mkv matches a mkv file.
func Mkv(raw []byte, limit uint32) bool {
return isMatroskaFileTypeMatched(raw, "matroska")
}
// isMatroskaFileTypeMatched is used for webm and mkv file matching.
// It checks for the EBML magic sequence 0x1A 0x45 0xDF 0xA3. If the sequence
// is found, the input is a Matroska media container, which includes WebM.
// It then verifies which file type the container represents by matching the
// file-specific string.
func isMatroskaFileTypeMatched(in []byte, flType string) bool {
if bytes.HasPrefix(in, []byte("\x1A\x45\xDF\xA3")) {
return isFileTypeNamePresent(in, flType)
}
return false
}
// isFileTypeNamePresent accepts the Matroska input data stream and searches
// for the given file type in the stream. It returns whether a match is found.
// The search logic is: find the first instance of \x42\x82, then look for the
// given string n bytes after that instance.
func isFileTypeNamePresent(in []byte, flType string) bool {
ind, maxInd, lenIn := 0, 4096, len(in)
if lenIn < maxInd { // restricting length to 4096
maxInd = lenIn
}
ind = bytes.Index(in[:maxInd], []byte("\x42\x82"))
if ind > 0 && lenIn > ind+2 {
ind += 2
// filetype name will be present exactly
// n bytes after the match of the two bytes "\x42\x82"
n := vintWidth(int(in[ind]))
if lenIn > ind+n {
return bytes.HasPrefix(in[ind+n:], []byte(flType))
}
}
return false
}
// vintWidth parses the variable-length integer width used in Matroska containers.
func vintWidth(v int) int {
mask, max, num := 128, 8, 1
for num < max && v&mask == 0 {
mask = mask >> 1
num++
}
return num
}
// Mpeg matches a Moving Picture Experts Group file.
func Mpeg(raw []byte, limit uint32) bool {
return len(raw) > 3 && bytes.HasPrefix(raw, []byte{0x00, 0x00, 0x01}) &&
raw[3] >= 0xB0 && raw[3] <= 0xBF
}
// Avi matches an Audio Video Interleaved file.
func Avi(raw []byte, limit uint32) bool {
return len(raw) > 16 &&
bytes.Equal(raw[:4], []byte("RIFF")) &&
bytes.Equal(raw[8:16], []byte("AVI LIST"))
}

View File

@@ -0,0 +1,189 @@
package magic
import (
"bytes"
"github.com/gabriel-vasile/mimetype/internal/scan"
)
var (
// Odt matches an OpenDocument Text file.
Odt = offset([]byte("mimetypeapplication/vnd.oasis.opendocument.text"), 30)
// Ott matches an OpenDocument Text Template file.
Ott = offset([]byte("mimetypeapplication/vnd.oasis.opendocument.text-template"), 30)
// Ods matches an OpenDocument Spreadsheet file.
Ods = offset([]byte("mimetypeapplication/vnd.oasis.opendocument.spreadsheet"), 30)
// Ots matches an OpenDocument Spreadsheet Template file.
Ots = offset([]byte("mimetypeapplication/vnd.oasis.opendocument.spreadsheet-template"), 30)
// Odp matches an OpenDocument Presentation file.
Odp = offset([]byte("mimetypeapplication/vnd.oasis.opendocument.presentation"), 30)
// Otp matches an OpenDocument Presentation Template file.
Otp = offset([]byte("mimetypeapplication/vnd.oasis.opendocument.presentation-template"), 30)
// Odg matches an OpenDocument Drawing file.
Odg = offset([]byte("mimetypeapplication/vnd.oasis.opendocument.graphics"), 30)
// Otg matches an OpenDocument Drawing Template file.
Otg = offset([]byte("mimetypeapplication/vnd.oasis.opendocument.graphics-template"), 30)
// Odf matches an OpenDocument Formula file.
Odf = offset([]byte("mimetypeapplication/vnd.oasis.opendocument.formula"), 30)
// Odc matches an OpenDocument Chart file.
Odc = offset([]byte("mimetypeapplication/vnd.oasis.opendocument.chart"), 30)
// Epub matches an EPUB file.
Epub = offset([]byte("mimetypeapplication/epub+zip"), 30)
// Sxc matches an OpenOffice Spreadsheet file.
Sxc = offset([]byte("mimetypeapplication/vnd.sun.xml.calc"), 30)
)
// Zip matches a zip archive.
func Zip(raw []byte, limit uint32) bool {
return len(raw) > 3 &&
raw[0] == 0x50 && raw[1] == 0x4B &&
(raw[2] == 0x3 || raw[2] == 0x5 || raw[2] == 0x7) &&
(raw[3] == 0x4 || raw[3] == 0x6 || raw[3] == 0x8)
}
// Jar matches a Java archive file. There are two types of Jar files:
// 1. the ones that can be opened with jexec and have 0xCAFE optional flag
// https://stackoverflow.com/tags/executable-jar/info
// 2. regular jars, same as above, just without the executable flag
// https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=262278#c0
// There is an argument for only checking the manifest, since it is the common
// denominator for both executable and non-executable versions. But traversing
// zip entries is unreliable because it does a linear search for signatures
// (instead of relying on the offsets recorded in the file).
func Jar(raw []byte, limit uint32) bool {
return executableJar(raw) ||
zipHas(raw, zipEntries{{
name: []byte("META-INF/MANIFEST.MF"),
}, {
name: []byte("META-INF/"),
}}, 1)
}
// KMZ matches a zipped KML file; the archived KML entry is named "doc.kml" by convention.
func KMZ(raw []byte, _ uint32) bool {
return zipHas(raw, zipEntries{{
name: []byte("doc.kml"),
}}, 100)
}
// An executable Jar has a 0xCAFE flag enabled in the first zip entry.
// The rule from file/file is:
// >(26.s+30) leshort 0xcafe Java archive data (JAR)
func executableJar(b scan.Bytes) bool {
b.Advance(0x1A)
offset, ok := b.Uint16()
if !ok {
return false
}
b.Advance(int(offset) + 2)
cafe, ok := b.Uint16()
return ok && cafe == 0xCAFE
}
// zipIterator iterates over a zip file, returning the names of the zip entries
// in that file.
type zipIterator struct {
b scan.Bytes
}
type zipEntries []struct {
name []byte
dir bool // dir means checking just the prefix of the entry, not the whole path
}
func (z zipEntries) match(file []byte) bool {
for i := range z {
if z[i].dir && bytes.HasPrefix(file, z[i].name) {
return true
}
if bytes.Equal(file, z[i].name) {
return true
}
}
return false
}
func zipHas(raw scan.Bytes, searchFor zipEntries, stopAfter int) bool {
iter := zipIterator{raw}
for i := 0; i < stopAfter; i++ {
f := iter.next()
if len(f) == 0 {
break
}
if searchFor.match(f) {
return true
}
}
return false
}
// msoxml behaves like zipHas, but it puts restrictions on what the first zip
// entry can be.
func msoxml(raw scan.Bytes, searchFor zipEntries, stopAfter int) bool {
iter := zipIterator{raw}
for i := 0; i < stopAfter; i++ {
f := iter.next()
if len(f) == 0 {
break
}
if searchFor.match(f) {
return true
}
// If the first entry is not one of the usually expected entries,
// abort this check.
if i == 0 {
if !bytes.Equal(f, []byte("[Content_Types].xml")) &&
!bytes.Equal(f, []byte("_rels/.rels")) &&
!bytes.Equal(f, []byte("docProps")) &&
!bytes.Equal(f, []byte("customXml")) &&
!bytes.Equal(f, []byte("[trash]")) {
return false
}
}
}
return false
}
// next extracts the name of the next zip entry.
func (i *zipIterator) next() []byte {
pk := []byte("PK\003\004")
n := bytes.Index(i.b, pk)
if n == -1 {
return nil
}
i.b.Advance(n)
if !i.b.Advance(0x1A) {
return nil
}
l, ok := i.b.Uint16()
if !ok {
return nil
}
if !i.b.Advance(0x02) {
return nil
}
if len(i.b) < int(l) {
return nil
}
return i.b[:l]
}
// APK matches an Android Package Archive.
// The source of signatures is https://github.com/file/file/blob/1778642b8ba3d947a779a36fcd81f8e807220a19/magic/Magdir/archive#L1820-L1887
func APK(raw []byte, _ uint32) bool {
return zipHas(raw, zipEntries{{
name: []byte("AndroidManifest.xml"),
}, {
name: []byte("META-INF/com/android/build/gradle/app-metadata.properties"),
}, {
name: []byte("classes.dex"),
}, {
name: []byte("resources.arsc"),
}, {
name: []byte("res/drawable"),
}}, 100)
}

View File

@@ -0,0 +1,103 @@
// Package markup implements functions for extracting info from
// HTML and XML documents.
package markup
import (
"bytes"
"github.com/gabriel-vasile/mimetype/internal/scan"
)
func GetAnAttribute(s *scan.Bytes) (name, val string, hasMore bool) {
for scan.ByteIsWS(s.Peek()) || s.Peek() == '/' {
s.Advance(1)
}
if s.Peek() == '>' {
return "", "", false
}
// Allocate 10 bytes to avoid resizes.
// Attribute names and values are contiguous slices of bytes in the input,
// so we could avoid allocating and return slices of the input instead.
nameB := make([]byte, 0, 10)
// step 4 and 5
for {
// bap means byte at position in the specification.
bap := s.Pop()
if bap == 0 {
return "", "", false
}
if bap == '=' && len(nameB) > 0 {
val, hasMore := getAValue(s)
return string(nameB), string(val), hasMore
} else if scan.ByteIsWS(bap) {
for scan.ByteIsWS(s.Peek()) {
s.Advance(1)
}
if s.Peek() != '=' {
return string(nameB), "", true
}
s.Advance(1)
for scan.ByteIsWS(s.Peek()) {
s.Advance(1)
}
val, hasMore := getAValue(s)
return string(nameB), string(val), hasMore
} else if bap == '/' || bap == '>' {
return string(nameB), "", false
} else if bap >= 'A' && bap <= 'Z' {
nameB = append(nameB, bap+0x20)
} else {
nameB = append(nameB, bap)
}
}
}
func getAValue(s *scan.Bytes) (_ []byte, hasMore bool) {
for scan.ByteIsWS(s.Peek()) {
s.Advance(1)
}
origS, end := *s, 0
bap := s.Pop()
if bap == 0 {
return nil, false
}
end++
// Step 10
switch bap {
case '"', '\'':
val := s.PopUntil(bap)
if s.Pop() != bap {
return nil, false
}
return val, s.Peek() != 0 && s.Peek() != '>'
case '>':
return nil, false
}
// Step 11
for {
bap = s.Pop()
if bap == 0 {
return nil, false
}
switch {
case scan.ByteIsWS(bap):
return origS[:end], true
case bap == '>':
return origS[:end], false
default:
end++
}
}
}
func SkipAComment(s *scan.Bytes) (skipped bool) {
if bytes.HasPrefix(*s, []byte("<!--")) {
// Offset by 2 len(<!) because the starting and ending -- can be the same.
if i := bytes.Index((*s)[2:], []byte("-->")); i != -1 {
s.Advance(i + 2 + 3) // 2 comes from len(<!) and 3 comes from len(-->).
return true
}
}
return false
}

View File

@@ -0,0 +1,213 @@
// Package scan has functions for scanning byte slices.
package scan
import (
"bytes"
"encoding/binary"
)
// Bytes is a byte slice with helper methods for easier scanning.
type Bytes []byte
func (b *Bytes) Advance(n int) bool {
if n < 0 || len(*b) < n {
return false
}
*b = (*b)[n:]
return true
}
// TrimLWS trims whitespace from beginning of the bytes.
func (b *Bytes) TrimLWS() {
firstNonWS := 0
for ; firstNonWS < len(*b) && ByteIsWS((*b)[firstNonWS]); firstNonWS++ {
}
*b = (*b)[firstNonWS:]
}
// TrimRWS trims whitespace from the end of the bytes.
func (b *Bytes) TrimRWS() {
lb := len(*b)
for lb > 0 && ByteIsWS((*b)[lb-1]) {
*b = (*b)[:lb-1]
lb--
}
}
// Peek one byte from b or 0x00 if b is empty.
func (b *Bytes) Peek() byte {
if len(*b) > 0 {
return (*b)[0]
}
return 0
}
// Pop one byte from b or 0x00 if b is empty.
func (b *Bytes) Pop() byte {
if len(*b) > 0 {
ret := (*b)[0]
*b = (*b)[1:]
return ret
}
return 0
}
// PopN pops n bytes from b, or returns nil if b holds fewer than n bytes.
func (b *Bytes) PopN(n int) []byte {
if len(*b) >= n {
ret := (*b)[:n]
*b = (*b)[n:]
return ret
}
return nil
}
// PopUntil will advance b until, but not including, the first occurrence of a
// stopAt character. If no occurrence is found, it advances to the end of b.
// The returned Bytes is a slice of all the bytes that were advanced over.
func (b *Bytes) PopUntil(stopAt ...byte) Bytes {
if len(*b) == 0 {
return Bytes{}
}
i := bytes.IndexAny(*b, string(stopAt))
if i == -1 {
i = len(*b)
}
prefix := (*b)[:i]
*b = (*b)[i:]
return Bytes(prefix)
}
// ReadSlice is the same as PopUntil, but the returned value includes stopAt as well.
func (b *Bytes) ReadSlice(stopAt byte) Bytes {
if len(*b) == 0 {
return Bytes{}
}
i := bytes.IndexByte(*b, stopAt)
if i == -1 {
i = len(*b)
} else {
i++
}
prefix := (*b)[:i]
*b = (*b)[i:]
return Bytes(prefix)
}
// Line returns the first line from b and advances b with the length of the
// line. One new line character is trimmed after the line if it exists.
func (b *Bytes) Line() Bytes {
line := b.PopUntil('\n')
lline := len(line)
if lline > 0 && line[lline-1] == '\r' {
line = line[:lline-1]
}
b.Advance(1)
return line
}
// DropLastLine drops the last incomplete line from b.
//
// mimetype limits itself to ReadLimit bytes when performing a detection.
// This means, for file formats like CSV or NDJSON, the last line of the input
// can be an incomplete line.
// If b is shorter than readLimit, the whole input was read and the last line
// is complete, so nothing is dropped; otherwise the last (possibly truncated)
// line is dropped.
func (b *Bytes) DropLastLine(readLimit uint32) {
if readLimit == 0 || uint32(len(*b)) < readLimit {
return
}
for i := len(*b) - 1; i > 0; i-- {
if (*b)[i] == '\n' {
*b = (*b)[:i]
return
}
}
}
func (b *Bytes) Uint16() (uint16, bool) {
if len(*b) < 2 {
return 0, false
}
v := binary.LittleEndian.Uint16(*b)
*b = (*b)[2:]
return v, true
}
const (
CompactWS = 1 << iota
IgnoreCase
)
// Search for occurrences of pattern p inside b at any index.
func (b Bytes) Search(p []byte, flags int) int {
if flags == 0 {
return bytes.Index(b, p)
}
lb, lp := len(b), len(p)
for i := range b {
if lb-i < lp {
return -1
}
if b[i:].Match(p, flags) {
return i
}
}
return 0
}
// Match pattern p at index 0 of b.
func (b Bytes) Match(p []byte, flags int) bool {
for len(b) > 0 {
// If we have matched everything we were looking for in p.
if len(p) == 0 {
return true
}
if flags&IgnoreCase > 0 && isUpper(p[0]) {
if upper(b[0]) != p[0] {
return false
}
b, p = b[1:], p[1:]
} else if flags&CompactWS > 0 && ByteIsWS(p[0]) {
p = p[1:]
if !ByteIsWS(b[0]) {
return false
}
b = b[1:]
if !ByteIsWS(p[0]) {
b.TrimLWS()
}
} else {
if b[0] != p[0] {
return false
}
b, p = b[1:], p[1:]
}
}
return true
}
func isUpper(c byte) bool {
return c >= 'A' && c <= 'Z'
}
func upper(c byte) byte {
if c >= 'a' && c <= 'z' {
return c - ('a' - 'A')
}
return c
}
func ByteIsWS(b byte) bool {
return b == '\t' || b == '\n' || b == '\x0c' || b == '\r' || b == ' '
}
var (
ASCIISpaces = []byte{' ', '\r', '\n', '\x0c', '\t'}
ASCIIDigits = []byte{'0', '1', '2', '3', '4', '5', '6', '7', '8', '9'}
)

188
vendor/github.com/gabriel-vasile/mimetype/mime.go generated vendored Normal file
View File

@@ -0,0 +1,188 @@
package mimetype
import (
"mime"
"github.com/gabriel-vasile/mimetype/internal/charset"
"github.com/gabriel-vasile/mimetype/internal/magic"
)
// MIME struct holds information about a file format: the string representation
// of the MIME type, the extension and the parent file format.
type MIME struct {
mime string
aliases []string
extension string
// detector receives the raw input and a limit for the number of bytes it is
// allowed to check. It returns whether the input matches a signature or not.
detector magic.Detector
children []*MIME
parent *MIME
}
// String returns the string representation of the MIME type, e.g., "application/zip".
func (m *MIME) String() string {
return m.mime
}
// Extension returns the file extension associated with the MIME type.
// It includes the leading dot, as in ".html". When the file format does not
// have an extension, the empty string is returned.
func (m *MIME) Extension() string {
return m.extension
}
// Parent returns the parent MIME type from the hierarchy.
// Each MIME type has a non-nil parent, except for the root MIME type.
//
// For example, the application/json and text/html MIME types have text/plain as
// their parent because they are text files that happen to contain JSON or HTML.
// Another example is the ZIP format, which is used as container
// for Microsoft Office files, EPUB files, JAR files, and others.
func (m *MIME) Parent() *MIME {
return m.parent
}
// Is checks whether this MIME type, or any of its aliases, is equal to the
// expected MIME type. MIME type equality test is done on the "type/subtype"
// section, ignores any optional MIME parameters, ignores any leading and
// trailing whitespace, and is case insensitive.
func (m *MIME) Is(expectedMIME string) bool {
// Parsing is needed because some detected MIME types contain parameters
// that need to be stripped for the comparison.
expectedMIME, _, _ = mime.ParseMediaType(expectedMIME)
found, _, _ := mime.ParseMediaType(m.mime)
if expectedMIME == found {
return true
}
for _, alias := range m.aliases {
if alias == expectedMIME {
return true
}
}
return false
}
func newMIME(
mime, extension string,
detector magic.Detector,
children ...*MIME) *MIME {
m := &MIME{
mime: mime,
extension: extension,
detector: detector,
children: children,
}
for _, c := range children {
c.parent = m
}
return m
}
func (m *MIME) alias(aliases ...string) *MIME {
m.aliases = aliases
return m
}
// match does a depth-first search on the signature tree. It returns the deepest
// matching node: one whose own detector passed but whose children's detectors all fail.
func (m *MIME) match(in []byte, readLimit uint32) *MIME {
for _, c := range m.children {
if c.detector(in, readLimit) {
return c.match(in, readLimit)
}
}
needsCharset := map[string]func([]byte) string{
"text/plain": charset.FromPlain,
"text/html": charset.FromHTML,
"text/xml": charset.FromXML,
}
charset := ""
if f, ok := needsCharset[m.mime]; ok {
// The charset comes from BOM, from HTML headers, from XML headers.
// Limit the number of bytes searched for to 1024.
charset = f(in[:min(len(in), 1024)])
}
if m == root {
return m
}
return m.cloneHierarchy(charset)
}
// flatten transforms a hierarchy of MIMEs into a slice of MIMEs.
func (m *MIME) flatten() []*MIME {
out := []*MIME{m}
for _, c := range m.children {
out = append(out, c.flatten()...)
}
return out
}
// clone creates a new MIME with the provided optional MIME parameters.
func (m *MIME) clone(charset string) *MIME {
clonedMIME := m.mime
if charset != "" {
clonedMIME = m.mime + "; charset=" + charset
}
return &MIME{
mime: clonedMIME,
aliases: m.aliases,
extension: m.extension,
}
}
// cloneHierarchy creates a clone of m and all its ancestors. The optional MIME
// parameters are set on the last child of the hierarchy.
func (m *MIME) cloneHierarchy(charset string) *MIME {
ret := m.clone(charset)
lastChild := ret
for p := m.Parent(); p != nil; p = p.Parent() {
pClone := p.clone("")
lastChild.parent = pClone
lastChild = pClone
}
return ret
}
func (m *MIME) lookup(mime string) *MIME {
for _, n := range append(m.aliases, m.mime) {
if n == mime {
return m
}
}
for _, c := range m.children {
if m := c.lookup(mime); m != nil {
return m
}
}
return nil
}
// Extend adds detection for a sub-format. The detector is a function
// returning true when the raw input file satisfies a signature.
// The sub-format will be detected if all the detectors in the parent chain return true.
// The extension should include the leading dot, as in ".html".
func (m *MIME) Extend(detector func(raw []byte, limit uint32) bool, mime, extension string, aliases ...string) {
c := &MIME{
mime: mime,
extension: extension,
detector: detector,
parent: m,
aliases: aliases,
}
mu.Lock()
m.children = append([]*MIME{c}, m.children...)
mu.Unlock()
}
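
A brief illustration of the hierarchy described above (a usage sketch, not part of the vendored file): detecting a small CSV payload and walking Parent() should yield text/csv, then text/plain, then application/octet-stream.

```go
package main

import (
	"fmt"

	"github.com/gabriel-vasile/mimetype"
)

func main() {
	m := mimetype.Detect([]byte("a,b\n1,2\n3,4\n"))
	// Walk the parent chain up to the root MIME type.
	for ; m != nil; m = m.Parent() {
		fmt.Println(m.String())
	}
	// Output:
	// text/csv
	// text/plain
	// application/octet-stream
}
```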

126
vendor/github.com/gabriel-vasile/mimetype/mimetype.go generated vendored Normal file
View File

@@ -0,0 +1,126 @@
// Package mimetype uses magic number signatures to detect the MIME type of a file.
//
// File formats are stored in a hierarchy with application/octet-stream at its root.
// For example, the hierarchy for HTML format is application/octet-stream ->
// text/plain -> text/html.
package mimetype
import (
"io"
"mime"
"os"
"sync/atomic"
)
var defaultLimit uint32 = 3072
// readLimit is the maximum number of bytes from the input used when detecting.
var readLimit uint32 = defaultLimit
// Detect returns the MIME type found from the provided byte slice.
//
// The result is always a valid MIME type, with application/octet-stream
// returned when identification failed.
func Detect(in []byte) *MIME {
// Using atomic because readLimit can be written at the same time by another goroutine.
l := atomic.LoadUint32(&readLimit)
if l > 0 && len(in) > int(l) {
in = in[:l]
}
mu.RLock()
defer mu.RUnlock()
return root.match(in, l)
}
// DetectReader returns the MIME type of the provided reader.
//
// The result is always a valid MIME type, with application/octet-stream
// returned when identification failed with or without an error.
// Any error returned is related to the reading from the input reader.
//
// DetectReader assumes the reader offset is at the start. If the input is an
// io.ReadSeeker you previously read from, it should be rewound before detection:
//
// reader.Seek(0, io.SeekStart)
func DetectReader(r io.Reader) (*MIME, error) {
var in []byte
var err error
// Using atomic because readLimit can be written at the same time by another goroutine.
l := atomic.LoadUint32(&readLimit)
if l == 0 {
in, err = io.ReadAll(r)
if err != nil {
return errMIME, err
}
} else {
var n int
in = make([]byte, l)
// io.ErrUnexpectedEOF means len(r) < len(in). It is not an error in this case,
// it just means the input file is smaller than the allocated bytes slice.
n, err = io.ReadFull(r, in)
if err != nil && err != io.EOF && err != io.ErrUnexpectedEOF {
return errMIME, err
}
in = in[:n]
}
mu.RLock()
defer mu.RUnlock()
return root.match(in, l), nil
}
// DetectFile returns the MIME type of the provided file.
//
// The result is always a valid MIME type, with application/octet-stream
// returned when identification failed with or without an error.
// Any error returned is related to the opening and reading from the input file.
func DetectFile(path string) (*MIME, error) {
f, err := os.Open(path)
if err != nil {
return errMIME, err
}
defer f.Close()
return DetectReader(f)
}
// EqualsAny reports whether s MIME type is equal to any MIME type in mimes.
// MIME type equality test is done on the "type/subtype" section, ignores
// any optional MIME parameters, ignores any leading and trailing whitespace,
// and is case insensitive.
func EqualsAny(s string, mimes ...string) bool {
s, _, _ = mime.ParseMediaType(s)
for _, m := range mimes {
m, _, _ = mime.ParseMediaType(m)
if s == m {
return true
}
}
return false
}
// SetLimit sets the maximum number of bytes read from input when detecting the MIME type.
// Increasing the limit provides better detection for file formats which store
// their magical numbers towards the end of the file: docx, pptx, xlsx, etc.
// During detection data is read in a single block of size limit, i.e. it is not buffered.
// A limit of 0 means the whole input file will be used.
func SetLimit(limit uint32) {
// Using atomic because readLimit can be read at the same time by another goroutine.
atomic.StoreUint32(&readLimit, limit)
}
// Extend adds detection for other file formats.
// It is equivalent to calling Extend() on the root mime type "application/octet-stream".
func Extend(detector func(raw []byte, limit uint32) bool, mime, extension string, aliases ...string) {
root.Extend(detector, mime, extension, aliases...)
}
// Lookup finds a MIME object by its string representation.
// The representation can be the main mime type, or any of its aliases.
func Lookup(mime string) *MIME {
mu.RLock()
defer mu.RUnlock()
return root.lookup(mime)
}
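
The functions above (Detect, DetectReader, DetectFile, SetLimit, Extend, Lookup) are the whole public surface a server like gemserve needs. A hedged usage sketch follows; the text/gemini detector and the file path are illustrative assumptions, not part of the vendored library:

```go
package main

import (
	"bytes"
	"fmt"

	"github.com/gabriel-vasile/mimetype"
)

func main() {
	// Keep the library default of inspecting at most 3072 bytes.
	mimetype.SetLimit(3072)

	// Register an illustrative text/gemini detector under text/plain.
	// The heuristic (a gemtext link line starting with "=>") is a rough sketch.
	if textPlain := mimetype.Lookup("text/plain"); textPlain != nil {
		textPlain.Extend(func(raw []byte, limit uint32) bool {
			return bytes.HasPrefix(raw, []byte("=> ")) ||
				bytes.Contains(raw, []byte("\n=> "))
		}, "text/gemini", ".gmi")
	}

	m, err := mimetype.DetectFile("srv/index.gmi") // path is illustrative
	if err != nil {
		fmt.Println("detection failed:", err)
		return
	}
	fmt.Println(m.String(), m.Extension())
}
```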

View File

@@ -0,0 +1,196 @@
## 191 Supported MIME types
This file is automatically generated when running tests. Do not edit manually.
Extension | MIME type | Aliases
--------- | --------- | -------
**n/a** | application/octet-stream | -
**.xpm** | image/x-xpixmap | -
**.7z** | application/x-7z-compressed | -
**.zip** | application/zip | application/x-zip, application/x-zip-compressed
**.docx** | application/vnd.openxmlformats-officedocument.wordprocessingml.document | -
**.pptx** | application/vnd.openxmlformats-officedocument.presentationml.presentation | -
**.xlsx** | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | -
**.epub** | application/epub+zip | -
**.apk** | application/vnd.android.package-archive | -
**.jar** | application/java-archive | application/jar, application/jar-archive, application/x-java-archive
**.odt** | application/vnd.oasis.opendocument.text | application/x-vnd.oasis.opendocument.text
**.ott** | application/vnd.oasis.opendocument.text-template | application/x-vnd.oasis.opendocument.text-template
**.ods** | application/vnd.oasis.opendocument.spreadsheet | application/x-vnd.oasis.opendocument.spreadsheet
**.ots** | application/vnd.oasis.opendocument.spreadsheet-template | application/x-vnd.oasis.opendocument.spreadsheet-template
**.odp** | application/vnd.oasis.opendocument.presentation | application/x-vnd.oasis.opendocument.presentation
**.otp** | application/vnd.oasis.opendocument.presentation-template | application/x-vnd.oasis.opendocument.presentation-template
**.odg** | application/vnd.oasis.opendocument.graphics | application/x-vnd.oasis.opendocument.graphics
**.otg** | application/vnd.oasis.opendocument.graphics-template | application/x-vnd.oasis.opendocument.graphics-template
**.odf** | application/vnd.oasis.opendocument.formula | application/x-vnd.oasis.opendocument.formula
**.odc** | application/vnd.oasis.opendocument.chart | application/x-vnd.oasis.opendocument.chart
**.sxc** | application/vnd.sun.xml.calc | -
**.kmz** | application/vnd.google-earth.kmz | -
**.vsdx** | application/vnd.ms-visio.drawing.main+xml | -
**.pdf** | application/pdf | application/x-pdf
**.fdf** | application/vnd.fdf | -
**n/a** | application/x-ole-storage | -
**.msi** | application/x-ms-installer | application/x-windows-installer, application/x-msi
**.aaf** | application/octet-stream | -
**.msg** | application/vnd.ms-outlook | -
**.xls** | application/vnd.ms-excel | application/msexcel
**.pub** | application/vnd.ms-publisher | -
**.ppt** | application/vnd.ms-powerpoint | application/mspowerpoint
**.doc** | application/msword | application/vnd.ms-word
**.ps** | application/postscript | -
**.psd** | image/vnd.adobe.photoshop | image/x-psd, application/photoshop
**.p7s** | application/pkcs7-signature | -
**.ogg** | application/ogg | application/x-ogg
**.oga** | audio/ogg | -
**.ogv** | video/ogg | -
**.png** | image/png | -
**.png** | image/vnd.mozilla.apng | -
**.jpg** | image/jpeg | -
**.jxl** | image/jxl | -
**.jp2** | image/jp2 | -
**.jpf** | image/jpx | -
**.jpm** | image/jpm | video/jpm
**.jxs** | image/jxs | -
**.gif** | image/gif | -
**.webp** | image/webp | -
**.exe** | application/vnd.microsoft.portable-executable | -
**n/a** | application/x-elf | -
**n/a** | application/x-object | -
**n/a** | application/x-executable | -
**.so** | application/x-sharedlib | -
**n/a** | application/x-coredump | -
**.a** | application/x-archive | application/x-unix-archive
**.deb** | application/vnd.debian.binary-package | -
**.tar** | application/x-tar | -
**.xar** | application/x-xar | -
**.bz2** | application/x-bzip2 | -
**.fits** | application/fits | image/fits
**.tiff** | image/tiff | -
**.bmp** | image/bmp | image/x-bmp, image/x-ms-bmp
**.123** | application/vnd.lotus-1-2-3 | -
**.ico** | image/x-icon | -
**.mp3** | audio/mpeg | audio/x-mpeg, audio/mp3
**.flac** | audio/flac | -
**.midi** | audio/midi | audio/mid, audio/sp-midi, audio/x-mid, audio/x-midi
**.ape** | audio/ape | -
**.mpc** | audio/musepack | -
**.amr** | audio/amr | audio/amr-nb
**.wav** | audio/wav | audio/x-wav, audio/vnd.wave, audio/wave
**.aiff** | audio/aiff | audio/x-aiff
**.au** | audio/basic | -
**.mpeg** | video/mpeg | -
**.mov** | video/quicktime | -
**.mp4** | video/mp4 | -
**.avif** | image/avif | -
**.3gp** | video/3gpp | video/3gp, audio/3gpp
**.3g2** | video/3gpp2 | video/3g2, audio/3gpp2
**.mp4** | audio/mp4 | audio/x-mp4a
**.mqv** | video/quicktime | -
**.m4a** | audio/x-m4a | -
**.m4v** | video/x-m4v | -
**.heic** | image/heic | -
**.heic** | image/heic-sequence | -
**.heif** | image/heif | -
**.heif** | image/heif-sequence | -
**.mj2** | video/mj2 | -
**.dvb** | video/vnd.dvb.file | -
**.webm** | video/webm | audio/webm
**.avi** | video/x-msvideo | video/avi, video/msvideo
**.flv** | video/x-flv | -
**.mkv** | video/x-matroska | -
**.asf** | video/x-ms-asf | video/asf, video/x-ms-wmv
**.aac** | audio/aac | -
**.voc** | audio/x-unknown | -
**.m3u** | application/vnd.apple.mpegurl | audio/mpegurl
**.rmvb** | application/vnd.rn-realmedia-vbr | -
**.gz** | application/gzip | application/x-gzip, application/x-gunzip, application/gzipped, application/gzip-compressed, application/x-gzip-compressed, gzip/document
**.class** | application/x-java-applet | -
**.swf** | application/x-shockwave-flash | -
**.crx** | application/x-chrome-extension | -
**.ttf** | font/ttf | font/sfnt, application/x-font-ttf, application/font-sfnt
**.woff** | font/woff | -
**.woff2** | font/woff2 | -
**.otf** | font/otf | -
**.ttc** | font/collection | -
**.eot** | application/vnd.ms-fontobject | -
**.wasm** | application/wasm | -
**.shx** | application/vnd.shx | -
**.shp** | application/vnd.shp | -
**.dbf** | application/x-dbf | -
**.dcm** | application/dicom | -
**.rar** | application/x-rar-compressed | application/x-rar
**.djvu** | image/vnd.djvu | -
**.mobi** | application/x-mobipocket-ebook | -
**.lit** | application/x-ms-reader | -
**.bpg** | image/bpg | -
**.cbor** | application/cbor | -
**.sqlite** | application/vnd.sqlite3 | application/x-sqlite3
**.dwg** | image/vnd.dwg | image/x-dwg, application/acad, application/x-acad, application/autocad_dwg, application/dwg, application/x-dwg, application/x-autocad, drawing/dwg
**.nes** | application/vnd.nintendo.snes.rom | -
**.lnk** | application/x-ms-shortcut | -
**.macho** | application/x-mach-binary | -
**.qcp** | audio/qcelp | -
**.icns** | image/x-icns | -
**.hdr** | image/vnd.radiance | -
**.mrc** | application/marc | -
**.mdb** | application/x-msaccess | -
**.accdb** | application/x-msaccess | -
**.zst** | application/zstd | -
**.cab** | application/vnd.ms-cab-compressed | -
**.rpm** | application/x-rpm | -
**.xz** | application/x-xz | -
**.lz** | application/lzip | application/x-lzip
**.torrent** | application/x-bittorrent | -
**.cpio** | application/x-cpio | -
**n/a** | application/tzif | -
**.xcf** | image/x-xcf | -
**.pat** | image/x-gimp-pat | -
**.gbr** | image/x-gimp-gbr | -
**.glb** | model/gltf-binary | -
**.cab** | application/x-installshield | -
**.jxr** | image/jxr | image/vnd.ms-photo
**.parquet** | application/vnd.apache.parquet | application/x-parquet
**.one** | application/onenote | -
**.chm** | application/vnd.ms-htmlhelp | -
**.txt** | text/plain | -
**.svg** | image/svg+xml | -
**.html** | text/html | -
**.xml** | text/xml | application/xml
**.rss** | application/rss+xml | text/rss
**.atom** | application/atom+xml | -
**.x3d** | model/x3d+xml | -
**.kml** | application/vnd.google-earth.kml+xml | -
**.xlf** | application/x-xliff+xml | -
**.dae** | model/vnd.collada+xml | -
**.gml** | application/gml+xml | -
**.gpx** | application/gpx+xml | -
**.tcx** | application/vnd.garmin.tcx+xml | -
**.amf** | application/x-amf | -
**.3mf** | application/vnd.ms-package.3dmanufacturing-3dmodel+xml | -
**.xfdf** | application/vnd.adobe.xfdf | -
**.owl** | application/owl+xml | -
**.html** | application/xhtml+xml | -
**.php** | text/x-php | -
**.js** | text/javascript | application/x-javascript, application/javascript
**.lua** | text/x-lua | -
**.pl** | text/x-perl | -
**.py** | text/x-python | text/x-script.python, application/x-python
**.rb** | text/x-ruby | application/x-ruby
**.json** | application/json | -
**.geojson** | application/geo+json | -
**.har** | application/json | -
**.gltf** | model/gltf+json | -
**.ndjson** | application/x-ndjson | -
**.rtf** | text/rtf | application/rtf
**.srt** | application/x-subrip | application/x-srt, text/x-srt
**.tcl** | text/x-tcl | application/x-tcl
**.csv** | text/csv | -
**.tsv** | text/tab-separated-values | -
**.vcf** | text/vcard | -
**.ics** | text/calendar | -
**.warc** | application/warc | -
**.vtt** | text/vtt | -
**.sh** | text/x-shellscript | text/x-sh, application/x-shellscript, application/x-sh
**.pbm** | image/x-portable-bitmap | -
**.pgm** | image/x-portable-graymap | -
**.ppm** | image/x-portable-pixmap | -
**.pam** | image/x-portable-arbitrarymap | -

289
vendor/github.com/gabriel-vasile/mimetype/tree.go generated vendored Normal file
View File

@@ -0,0 +1,289 @@
package mimetype
import (
"sync"
"github.com/gabriel-vasile/mimetype/internal/magic"
)
// mimetype stores the list of MIME types in a tree structure with
// "application/octet-stream" at the root of the hierarchy. The hierarchy
// approach minimizes the number of checks that need to be done on the input
// and allows for more precise results once the base type of file has been
// identified.
//
// root is a detector which passes for any slice of bytes.
// When a detector passes the check, the children detectors
// are tried in order to find a more accurate MIME type.
var root = newMIME("application/octet-stream", "",
func([]byte, uint32) bool { return true },
xpm, sevenZ, zip, pdf, fdf, ole, ps, psd, p7s, ogg, png, jpg, jxl, jp2, jpx,
jpm, jxs, gif, webp, exe, elf, ar, tar, xar, bz2, fits, tiff, bmp, lotus, ico,
mp3, flac, midi, ape, musePack, amr, wav, aiff, au, mpeg, quickTime, mp4, webM,
avi, flv, mkv, asf, aac, voc, m3u, rmvb, gzip, class, swf, crx, ttf, woff,
woff2, otf, ttc, eot, wasm, shx, dbf, dcm, rar, djvu, mobi, lit, bpg, cbor,
sqlite3, dwg, nes, lnk, macho, qcp, icns, hdr, mrc, mdb, accdb, zstd, cab,
rpm, xz, lzip, torrent, cpio, tzif, xcf, pat, gbr, glb, cabIS, jxr, parquet,
oneNote, chm,
// Keep text last because it is the slowest check.
text,
)
// errMIME is returned from Detect functions when err is not nil.
// Detect could return root for erroneous cases, but it needs to lock mu in order to do so.
// errMIME is the same as root, but it does not require locking.
var errMIME = newMIME("application/octet-stream", "", func([]byte, uint32) bool { return false })
// mu guards access to the root MIME tree. Access to root must be synchronized with this lock.
var mu = &sync.RWMutex{}
// The list of nodes appended to the root node.
var (
xz = newMIME("application/x-xz", ".xz", magic.Xz)
gzip = newMIME("application/gzip", ".gz", magic.Gzip).alias(
"application/x-gzip", "application/x-gunzip", "application/gzipped",
"application/gzip-compressed", "application/x-gzip-compressed",
"gzip/document")
sevenZ = newMIME("application/x-7z-compressed", ".7z", magic.SevenZ)
// APK must be checked before JAR because APK is a subset of JAR.
// This means APK should be a child of the JAR detector, but in practice
// the decisive signature for JAR might be located at the end of the file
// and not reachable because of the library's readLimit.
zip = newMIME("application/zip", ".zip", magic.Zip, docx, pptx, xlsx, epub, apk, jar, odt, ods, odp, odg, odf, odc, sxc, kmz, visio).
alias("application/x-zip", "application/x-zip-compressed")
tar = newMIME("application/x-tar", ".tar", magic.Tar)
xar = newMIME("application/x-xar", ".xar", magic.Xar)
bz2 = newMIME("application/x-bzip2", ".bz2", magic.Bz2)
pdf = newMIME("application/pdf", ".pdf", magic.PDF).
alias("application/x-pdf")
fdf = newMIME("application/vnd.fdf", ".fdf", magic.Fdf)
xlsx = newMIME("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", ".xlsx", magic.Xlsx)
docx = newMIME("application/vnd.openxmlformats-officedocument.wordprocessingml.document", ".docx", magic.Docx)
pptx = newMIME("application/vnd.openxmlformats-officedocument.presentationml.presentation", ".pptx", magic.Pptx)
visio = newMIME("application/vnd.ms-visio.drawing.main+xml", ".vsdx", magic.Visio)
epub = newMIME("application/epub+zip", ".epub", magic.Epub)
jar = newMIME("application/java-archive", ".jar", magic.Jar).
alias("application/jar", "application/jar-archive", "application/x-java-archive")
apk = newMIME("application/vnd.android.package-archive", ".apk", magic.APK)
ole = newMIME("application/x-ole-storage", "", magic.Ole, msi, aaf, msg, xls, pub, ppt, doc)
msi = newMIME("application/x-ms-installer", ".msi", magic.Msi).
alias("application/x-windows-installer", "application/x-msi")
aaf = newMIME("application/octet-stream", ".aaf", magic.Aaf)
doc = newMIME("application/msword", ".doc", magic.Doc).
alias("application/vnd.ms-word")
ppt = newMIME("application/vnd.ms-powerpoint", ".ppt", magic.Ppt).
alias("application/mspowerpoint")
pub = newMIME("application/vnd.ms-publisher", ".pub", magic.Pub)
xls = newMIME("application/vnd.ms-excel", ".xls", magic.Xls).
alias("application/msexcel")
msg = newMIME("application/vnd.ms-outlook", ".msg", magic.Msg)
ps = newMIME("application/postscript", ".ps", magic.Ps)
fits = newMIME("application/fits", ".fits", magic.Fits).alias("image/fits")
ogg = newMIME("application/ogg", ".ogg", magic.Ogg, oggAudio, oggVideo).
alias("application/x-ogg")
oggAudio = newMIME("audio/ogg", ".oga", magic.OggAudio)
oggVideo = newMIME("video/ogg", ".ogv", magic.OggVideo)
text = newMIME("text/plain", ".txt", magic.Text, svg, html, xml, php, js, lua, perl, python, ruby, json, ndJSON, rtf, srt, tcl, csv, tsv, vCard, iCalendar, warc, vtt, shell, netpbm, netpgm, netppm, netpam)
xml = newMIME("text/xml", ".xml", magic.XML, rss, atom, x3d, kml, xliff, collada, gml, gpx, tcx, amf, threemf, xfdf, owl2, xhtml).
alias("application/xml")
xhtml = newMIME("application/xhtml+xml", ".html", magic.XHTML)
json = newMIME("application/json", ".json", magic.JSON, geoJSON, har, gltf)
har = newMIME("application/json", ".har", magic.HAR)
csv = newMIME("text/csv", ".csv", magic.CSV)
tsv = newMIME("text/tab-separated-values", ".tsv", magic.TSV)
geoJSON = newMIME("application/geo+json", ".geojson", magic.GeoJSON)
ndJSON = newMIME("application/x-ndjson", ".ndjson", magic.NdJSON)
html = newMIME("text/html", ".html", magic.HTML)
php = newMIME("text/x-php", ".php", magic.Php)
rtf = newMIME("text/rtf", ".rtf", magic.Rtf).alias("application/rtf")
js = newMIME("text/javascript", ".js", magic.Js).
alias("application/x-javascript", "application/javascript")
srt = newMIME("application/x-subrip", ".srt", magic.Srt).
alias("application/x-srt", "text/x-srt")
vtt = newMIME("text/vtt", ".vtt", magic.Vtt)
lua = newMIME("text/x-lua", ".lua", magic.Lua)
perl = newMIME("text/x-perl", ".pl", magic.Perl)
python = newMIME("text/x-python", ".py", magic.Python).
alias("text/x-script.python", "application/x-python")
ruby = newMIME("text/x-ruby", ".rb", magic.Ruby).
alias("application/x-ruby")
shell = newMIME("text/x-shellscript", ".sh", magic.Shell).
alias("text/x-sh", "application/x-shellscript", "application/x-sh")
tcl = newMIME("text/x-tcl", ".tcl", magic.Tcl).
alias("application/x-tcl")
vCard = newMIME("text/vcard", ".vcf", magic.VCard)
iCalendar = newMIME("text/calendar", ".ics", magic.ICalendar)
svg = newMIME("image/svg+xml", ".svg", magic.Svg)
rss = newMIME("application/rss+xml", ".rss", magic.Rss).
alias("text/rss")
owl2 = newMIME("application/owl+xml", ".owl", magic.Owl2)
atom = newMIME("application/atom+xml", ".atom", magic.Atom)
x3d = newMIME("model/x3d+xml", ".x3d", magic.X3d)
kml = newMIME("application/vnd.google-earth.kml+xml", ".kml", magic.Kml)
kmz = newMIME("application/vnd.google-earth.kmz", ".kmz", magic.KMZ)
xliff = newMIME("application/x-xliff+xml", ".xlf", magic.Xliff)
collada = newMIME("model/vnd.collada+xml", ".dae", magic.Collada)
gml = newMIME("application/gml+xml", ".gml", magic.Gml)
gpx = newMIME("application/gpx+xml", ".gpx", magic.Gpx)
tcx = newMIME("application/vnd.garmin.tcx+xml", ".tcx", magic.Tcx)
amf = newMIME("application/x-amf", ".amf", magic.Amf)
threemf = newMIME("application/vnd.ms-package.3dmanufacturing-3dmodel+xml", ".3mf", magic.Threemf)
png = newMIME("image/png", ".png", magic.Png, apng)
apng = newMIME("image/vnd.mozilla.apng", ".png", magic.Apng)
jpg = newMIME("image/jpeg", ".jpg", magic.Jpg)
jxl = newMIME("image/jxl", ".jxl", magic.Jxl)
jp2 = newMIME("image/jp2", ".jp2", magic.Jp2)
jpx = newMIME("image/jpx", ".jpf", magic.Jpx)
jpm = newMIME("image/jpm", ".jpm", magic.Jpm).
alias("video/jpm")
jxs = newMIME("image/jxs", ".jxs", magic.Jxs)
xpm = newMIME("image/x-xpixmap", ".xpm", magic.Xpm)
bpg = newMIME("image/bpg", ".bpg", magic.Bpg)
gif = newMIME("image/gif", ".gif", magic.Gif)
webp = newMIME("image/webp", ".webp", magic.Webp)
tiff = newMIME("image/tiff", ".tiff", magic.Tiff)
bmp = newMIME("image/bmp", ".bmp", magic.Bmp).
alias("image/x-bmp", "image/x-ms-bmp")
// lotus check must be done before ico because some ico detection is a bit
// relaxed and some lotus files are wrongfully identified as ico otherwise.
lotus = newMIME("application/vnd.lotus-1-2-3", ".123", magic.Lotus123)
ico = newMIME("image/x-icon", ".ico", magic.Ico)
icns = newMIME("image/x-icns", ".icns", magic.Icns)
psd = newMIME("image/vnd.adobe.photoshop", ".psd", magic.Psd).
alias("image/x-psd", "application/photoshop")
heic = newMIME("image/heic", ".heic", magic.Heic)
heicSeq = newMIME("image/heic-sequence", ".heic", magic.HeicSequence)
heif = newMIME("image/heif", ".heif", magic.Heif)
heifSeq = newMIME("image/heif-sequence", ".heif", magic.HeifSequence)
hdr = newMIME("image/vnd.radiance", ".hdr", magic.Hdr)
avif = newMIME("image/avif", ".avif", magic.AVIF)
mp3 = newMIME("audio/mpeg", ".mp3", magic.Mp3).
alias("audio/x-mpeg", "audio/mp3")
flac = newMIME("audio/flac", ".flac", magic.Flac)
midi = newMIME("audio/midi", ".midi", magic.Midi).
alias("audio/mid", "audio/sp-midi", "audio/x-mid", "audio/x-midi")
ape = newMIME("audio/ape", ".ape", magic.Ape)
musePack = newMIME("audio/musepack", ".mpc", magic.MusePack)
wav = newMIME("audio/wav", ".wav", magic.Wav).
alias("audio/x-wav", "audio/vnd.wave", "audio/wave")
aiff = newMIME("audio/aiff", ".aiff", magic.Aiff).alias("audio/x-aiff")
au = newMIME("audio/basic", ".au", magic.Au)
amr = newMIME("audio/amr", ".amr", magic.Amr).
alias("audio/amr-nb")
aac = newMIME("audio/aac", ".aac", magic.AAC)
voc = newMIME("audio/x-unknown", ".voc", magic.Voc)
aMp4 = newMIME("audio/mp4", ".mp4", magic.AMp4).
alias("audio/x-mp4a")
m4a = newMIME("audio/x-m4a", ".m4a", magic.M4a)
m3u = newMIME("application/vnd.apple.mpegurl", ".m3u", magic.M3u).
alias("audio/mpegurl")
m4v = newMIME("video/x-m4v", ".m4v", magic.M4v)
mj2 = newMIME("video/mj2", ".mj2", magic.Mj2)
dvb = newMIME("video/vnd.dvb.file", ".dvb", magic.Dvb)
mp4 = newMIME("video/mp4", ".mp4", magic.Mp4, avif, threeGP, threeG2, aMp4, mqv, m4a, m4v, heic, heicSeq, heif, heifSeq, mj2, dvb)
webM = newMIME("video/webm", ".webm", magic.WebM).
alias("audio/webm")
mpeg = newMIME("video/mpeg", ".mpeg", magic.Mpeg)
quickTime = newMIME("video/quicktime", ".mov", magic.QuickTime)
mqv = newMIME("video/quicktime", ".mqv", magic.Mqv)
threeGP = newMIME("video/3gpp", ".3gp", magic.ThreeGP).
alias("video/3gp", "audio/3gpp")
threeG2 = newMIME("video/3gpp2", ".3g2", magic.ThreeG2).
alias("video/3g2", "audio/3gpp2")
avi = newMIME("video/x-msvideo", ".avi", magic.Avi).
alias("video/avi", "video/msvideo")
flv = newMIME("video/x-flv", ".flv", magic.Flv)
mkv = newMIME("video/x-matroska", ".mkv", magic.Mkv)
asf = newMIME("video/x-ms-asf", ".asf", magic.Asf).
alias("video/asf", "video/x-ms-wmv")
rmvb = newMIME("application/vnd.rn-realmedia-vbr", ".rmvb", magic.Rmvb)
class = newMIME("application/x-java-applet", ".class", magic.Class)
swf = newMIME("application/x-shockwave-flash", ".swf", magic.SWF)
crx = newMIME("application/x-chrome-extension", ".crx", magic.CRX)
ttf = newMIME("font/ttf", ".ttf", magic.Ttf).
alias("font/sfnt", "application/x-font-ttf", "application/font-sfnt")
woff = newMIME("font/woff", ".woff", magic.Woff)
woff2 = newMIME("font/woff2", ".woff2", magic.Woff2)
otf = newMIME("font/otf", ".otf", magic.Otf)
ttc = newMIME("font/collection", ".ttc", magic.Ttc)
eot = newMIME("application/vnd.ms-fontobject", ".eot", magic.Eot)
wasm = newMIME("application/wasm", ".wasm", magic.Wasm)
shp = newMIME("application/vnd.shp", ".shp", magic.Shp)
shx = newMIME("application/vnd.shx", ".shx", magic.Shx, shp)
dbf = newMIME("application/x-dbf", ".dbf", magic.Dbf)
exe = newMIME("application/vnd.microsoft.portable-executable", ".exe", magic.Exe)
elf = newMIME("application/x-elf", "", magic.Elf, elfObj, elfExe, elfLib, elfDump)
elfObj = newMIME("application/x-object", "", magic.ElfObj)
elfExe = newMIME("application/x-executable", "", magic.ElfExe)
elfLib = newMIME("application/x-sharedlib", ".so", magic.ElfLib)
elfDump = newMIME("application/x-coredump", "", magic.ElfDump)
ar = newMIME("application/x-archive", ".a", magic.Ar, deb).
alias("application/x-unix-archive")
deb = newMIME("application/vnd.debian.binary-package", ".deb", magic.Deb)
rpm = newMIME("application/x-rpm", ".rpm", magic.RPM)
dcm = newMIME("application/dicom", ".dcm", magic.Dcm)
odt = newMIME("application/vnd.oasis.opendocument.text", ".odt", magic.Odt, ott).
alias("application/x-vnd.oasis.opendocument.text")
ott = newMIME("application/vnd.oasis.opendocument.text-template", ".ott", magic.Ott).
alias("application/x-vnd.oasis.opendocument.text-template")
ods = newMIME("application/vnd.oasis.opendocument.spreadsheet", ".ods", magic.Ods, ots).
alias("application/x-vnd.oasis.opendocument.spreadsheet")
ots = newMIME("application/vnd.oasis.opendocument.spreadsheet-template", ".ots", magic.Ots).
alias("application/x-vnd.oasis.opendocument.spreadsheet-template")
odp = newMIME("application/vnd.oasis.opendocument.presentation", ".odp", magic.Odp, otp).
alias("application/x-vnd.oasis.opendocument.presentation")
otp = newMIME("application/vnd.oasis.opendocument.presentation-template", ".otp", magic.Otp).
alias("application/x-vnd.oasis.opendocument.presentation-template")
odg = newMIME("application/vnd.oasis.opendocument.graphics", ".odg", magic.Odg, otg).
alias("application/x-vnd.oasis.opendocument.graphics")
otg = newMIME("application/vnd.oasis.opendocument.graphics-template", ".otg", magic.Otg).
alias("application/x-vnd.oasis.opendocument.graphics-template")
odf = newMIME("application/vnd.oasis.opendocument.formula", ".odf", magic.Odf).
alias("application/x-vnd.oasis.opendocument.formula")
odc = newMIME("application/vnd.oasis.opendocument.chart", ".odc", magic.Odc).
alias("application/x-vnd.oasis.opendocument.chart")
sxc = newMIME("application/vnd.sun.xml.calc", ".sxc", magic.Sxc)
rar = newMIME("application/x-rar-compressed", ".rar", magic.RAR).
alias("application/x-rar")
djvu = newMIME("image/vnd.djvu", ".djvu", magic.DjVu)
mobi = newMIME("application/x-mobipocket-ebook", ".mobi", magic.Mobi)
lit = newMIME("application/x-ms-reader", ".lit", magic.Lit)
sqlite3 = newMIME("application/vnd.sqlite3", ".sqlite", magic.Sqlite).
alias("application/x-sqlite3")
dwg = newMIME("image/vnd.dwg", ".dwg", magic.Dwg).
alias("image/x-dwg", "application/acad", "application/x-acad",
"application/autocad_dwg", "application/dwg", "application/x-dwg",
"application/x-autocad", "drawing/dwg")
warc = newMIME("application/warc", ".warc", magic.Warc)
nes = newMIME("application/vnd.nintendo.snes.rom", ".nes", magic.Nes)
lnk = newMIME("application/x-ms-shortcut", ".lnk", magic.Lnk)
macho = newMIME("application/x-mach-binary", ".macho", magic.MachO)
qcp = newMIME("audio/qcelp", ".qcp", magic.Qcp)
mrc = newMIME("application/marc", ".mrc", magic.Marc)
mdb = newMIME("application/x-msaccess", ".mdb", magic.MsAccessMdb)
accdb = newMIME("application/x-msaccess", ".accdb", magic.MsAccessAce)
zstd = newMIME("application/zstd", ".zst", magic.Zstd)
cab = newMIME("application/vnd.ms-cab-compressed", ".cab", magic.Cab)
cabIS = newMIME("application/x-installshield", ".cab", magic.InstallShieldCab)
lzip = newMIME("application/lzip", ".lz", magic.Lzip).alias("application/x-lzip")
torrent = newMIME("application/x-bittorrent", ".torrent", magic.Torrent)
cpio = newMIME("application/x-cpio", ".cpio", magic.Cpio)
tzif = newMIME("application/tzif", "", magic.TzIf)
p7s = newMIME("application/pkcs7-signature", ".p7s", magic.P7s)
xcf = newMIME("image/x-xcf", ".xcf", magic.Xcf)
pat = newMIME("image/x-gimp-pat", ".pat", magic.Pat)
gbr = newMIME("image/x-gimp-gbr", ".gbr", magic.Gbr)
xfdf = newMIME("application/vnd.adobe.xfdf", ".xfdf", magic.Xfdf)
glb = newMIME("model/gltf-binary", ".glb", magic.GLB)
gltf = newMIME("model/gltf+json", ".gltf", magic.GLTF)
jxr = newMIME("image/jxr", ".jxr", magic.Jxr).alias("image/vnd.ms-photo")
parquet = newMIME("application/vnd.apache.parquet", ".parquet", magic.Par1).
alias("application/x-parquet")
netpbm = newMIME("image/x-portable-bitmap", ".pbm", magic.NetPBM)
netpgm = newMIME("image/x-portable-graymap", ".pgm", magic.NetPGM)
netppm = newMIME("image/x-portable-pixmap", ".ppm", magic.NetPPM)
netpam = newMIME("image/x-portable-arbitrarymap", ".pam", magic.NetPAM)
cbor = newMIME("application/cbor", ".cbor", magic.CBOR)
oneNote = newMIME("application/onenote", ".one", magic.One)
chm = newMIME("application/vnd.ms-htmlhelp", ".chm", magic.CHM)
)
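For context on how this detector tree is consumed: the mimetype package walks these nodes (root → archive/text/image/… → aliases) when classifying a file before serving it. A minimal usage sketch, assuming the library's public `DetectFile`, `String`, `Extension`, and `Is` calls; the `./srv/index.gmi` path is purely illustrative:

```go
package main

import (
	"fmt"
	"log"

	"github.com/gabriel-vasile/mimetype"
)

func main() {
	// DetectFile reads a bounded header from the file and walks the
	// magic-byte detector tree defined above.
	mtype, err := mimetype.DetectFile("./srv/index.gmi")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(mtype.String(), mtype.Extension())

	// Is also matches the aliases registered above,
	// e.g. "application/xml" matches the "text/xml" node.
	fmt.Println(mtype.Is("text/plain"))
}
```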

21
vendor/github.com/lmittmann/tint/LICENSE generated vendored Normal file

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2023 lmittmann
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

124
vendor/github.com/lmittmann/tint/README.md generated vendored Normal file

@@ -0,0 +1,124 @@
# `tint`: 🌈 **slog.Handler** that writes tinted logs
[![Go Reference](https://pkg.go.dev/badge/github.com/lmittmann/tint.svg)](https://pkg.go.dev/github.com/lmittmann/tint#section-documentation)
[![Go Report Card](https://goreportcard.com/badge/github.com/lmittmann/tint)](https://goreportcard.com/report/github.com/lmittmann/tint)
<picture>
<source media="(prefers-color-scheme: dark)" srcset="https://github.com/lmittmann/tint/assets/3458786/3d42f8d5-8bdf-40db-a16a-1939c88689cb">
<source media="(prefers-color-scheme: light)" srcset="https://github.com/lmittmann/tint/assets/3458786/3d42f8d5-8bdf-40db-a16a-1939c88689cb">
<img src="https://github.com/lmittmann/tint/assets/3458786/3d42f8d5-8bdf-40db-a16a-1939c88689cb">
</picture>
<br>
<br>
Package `tint` implements a zero-dependency [`slog.Handler`](https://pkg.go.dev/log/slog#Handler)
that writes tinted (colorized) logs. Its output format is inspired by the `zerolog.ConsoleWriter` and
[`slog.TextHandler`](https://pkg.go.dev/log/slog#TextHandler).
The output format can be customized using [`Options`](https://pkg.go.dev/github.com/lmittmann/tint#Options)
which is a drop-in replacement for [`slog.HandlerOptions`](https://pkg.go.dev/log/slog#HandlerOptions).
```
go get github.com/lmittmann/tint
```
## Usage
```go
w := os.Stderr
// Create a new logger
logger := slog.New(tint.NewHandler(w, nil))
// Set global logger with custom options
slog.SetDefault(slog.New(
tint.NewHandler(w, &tint.Options{
Level: slog.LevelDebug,
TimeFormat: time.Kitchen,
}),
))
```
### Customize Attributes
`ReplaceAttr` can be used to alter or drop attributes. If set, it is called on
each non-group attribute before it is logged. See [`slog.HandlerOptions`](https://pkg.go.dev/log/slog#HandlerOptions)
for details.
```go
// Create a new logger with a custom TRACE level:
const LevelTrace = slog.LevelDebug - 4
w := os.Stderr
logger := slog.New(tint.NewHandler(w, &tint.Options{
Level: LevelTrace,
ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
if a.Key == slog.LevelKey && len(groups) == 0 {
level, ok := a.Value.Any().(slog.Level)
if ok && level <= LevelTrace {
return tint.Attr(13, slog.String(a.Key, "TRC"))
}
}
return a
},
}))
```
```go
// Create a new logger that doesn't write the time
w := os.Stderr
logger := slog.New(
tint.NewHandler(w, &tint.Options{
ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
if a.Key == slog.TimeKey && len(groups) == 0 {
return slog.Attr{}
}
return a
},
}),
)
```
```go
// Create a new logger that writes all errors in red
w := os.Stderr
logger := slog.New(
tint.NewHandler(w, &tint.Options{
ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
if a.Value.Kind() == slog.KindAny {
if _, ok := a.Value.Any().(error); ok {
return tint.Attr(9, a)
}
}
return a
},
}),
)
```
### Automatically Enable Colors
Colors are enabled by default. Use the `Options.NoColor` field to disable
color output. To automatically enable colors based on terminal capabilities, use
e.g., the [`go-isatty`](https://github.com/mattn/go-isatty) package:
```go
w := os.Stderr
logger := slog.New(
tint.NewHandler(w, &tint.Options{
NoColor: !isatty.IsTerminal(w.Fd()),
}),
)
```
### Windows Support
Color support on Windows can be added by using e.g., the
[`go-colorable`](https://github.com/mattn/go-colorable) package:
```go
w := os.Stderr
logger := slog.New(
tint.NewHandler(colorable.NewColorable(w), nil),
)
```

40
vendor/github.com/lmittmann/tint/buffer.go generated vendored Normal file

@@ -0,0 +1,40 @@
package tint
import "sync"
type buffer []byte
var bufPool = sync.Pool{
New: func() any {
b := make(buffer, 0, 1024)
return (*buffer)(&b)
},
}
func newBuffer() *buffer {
return bufPool.Get().(*buffer)
}
func (b *buffer) Free() {
// To reduce peak allocation, return only smaller buffers to the pool.
const maxBufferSize = 16 << 10
if cap(*b) <= maxBufferSize {
*b = (*b)[:0]
bufPool.Put(b)
}
}
func (b *buffer) Write(bytes []byte) (int, error) {
*b = append(*b, bytes...)
return len(bytes), nil
}
func (b *buffer) WriteByte(char byte) error {
*b = append(*b, char)
return nil
}
func (b *buffer) WriteString(str string) (int, error) {
*b = append(*b, str...)
return len(str), nil
}
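The pooled-buffer pattern above (a `sync.Pool` of byte slices, with only small buffers returned to the pool) stands on its own. A standalone sketch of the same idea, deliberately independent of tint's unexported `buffer` type:

```go
package main

import (
	"fmt"
	"sync"
)

var pool = sync.Pool{
	New: func() any {
		b := make([]byte, 0, 1024)
		return &b
	},
}

func main() {
	bufp := pool.Get().(*[]byte)
	*bufp = append(*bufp, "one log line\n"...)
	fmt.Print(string(*bufp))

	// Mirror tint's Free: only return reasonably small buffers to the pool,
	// so a single oversized record does not pin memory indefinitely.
	if cap(*bufp) <= 16<<10 {
		*bufp = (*bufp)[:0]
		pool.Put(bufp)
	}
}
```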

754
vendor/github.com/lmittmann/tint/handler.go generated vendored Normal file

@@ -0,0 +1,754 @@
/*
Package tint implements a zero-dependency [slog.Handler] that writes tinted
(colorized) logs. The output format is inspired by the [zerolog.ConsoleWriter]
and [slog.TextHandler].
The output format can be customized using [Options], which is a drop-in
replacement for [slog.HandlerOptions].
# Customize Attributes
Options.ReplaceAttr can be used to alter or drop attributes. If set, it is
called on each non-group attribute before it is logged.
See [slog.HandlerOptions] for details.
Create a new logger with a custom TRACE level:
const LevelTrace = slog.LevelDebug - 4
w := os.Stderr
logger := slog.New(tint.NewHandler(w, &tint.Options{
Level: LevelTrace,
ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
if a.Key == slog.LevelKey && len(groups) == 0 {
level, ok := a.Value.Any().(slog.Level)
if ok && level <= LevelTrace {
return tint.Attr(13, slog.String(a.Key, "TRC"))
}
}
return a
},
}))
Create a new logger that doesn't write the time:
w := os.Stderr
logger := slog.New(
tint.NewHandler(w, &tint.Options{
ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
if a.Key == slog.TimeKey && len(groups) == 0 {
return slog.Attr{}
}
return a
},
}),
)
Create a new logger that writes all errors in red:
w := os.Stderr
logger := slog.New(
tint.NewHandler(w, &tint.Options{
ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
if a.Value.Kind() == slog.KindAny {
if _, ok := a.Value.Any().(error); ok {
return tint.Attr(9, a)
}
}
return a
},
}),
)
# Automatically Enable Colors
Colors are enabled by default. Use the Options.NoColor field to disable
color output. To automatically enable colors based on terminal capabilities, use
e.g., the [go-isatty] package:
w := os.Stderr
logger := slog.New(
tint.NewHandler(w, &tint.Options{
NoColor: !isatty.IsTerminal(w.Fd()),
}),
)
# Windows Support
Color support on Windows can be added by using e.g., the [go-colorable] package:
w := os.Stderr
logger := slog.New(
tint.NewHandler(colorable.NewColorable(w), nil),
)
[zerolog.ConsoleWriter]: https://pkg.go.dev/github.com/rs/zerolog#ConsoleWriter
[go-isatty]: https://pkg.go.dev/github.com/mattn/go-isatty
[go-colorable]: https://pkg.go.dev/github.com/mattn/go-colorable
*/
package tint
import (
"context"
"encoding"
"fmt"
"io"
"log/slog"
"path/filepath"
"reflect"
"runtime"
"strconv"
"strings"
"sync"
"time"
"unicode"
"unicode/utf8"
)
const (
// ANSI modes
ansiEsc = '\u001b'
ansiReset = "\u001b[0m"
ansiFaint = "\u001b[2m"
ansiResetFaint = "\u001b[22m"
ansiBrightRed = "\u001b[91m"
ansiBrightGreen = "\u001b[92m"
ansiBrightYellow = "\u001b[93m"
errKey = "err"
defaultLevel = slog.LevelInfo
defaultTimeFormat = time.StampMilli
)
// Options for a slog.Handler that writes tinted logs. A zero Options consists
// entirely of default values.
//
// Options can be used as a drop-in replacement for [slog.HandlerOptions].
type Options struct {
// Enable source code location (Default: false)
AddSource bool
// Minimum level to log (Default: slog.LevelInfo)
Level slog.Leveler
// ReplaceAttr is called to rewrite each non-group attribute before it is logged.
// See https://pkg.go.dev/log/slog#HandlerOptions for details.
ReplaceAttr func(groups []string, attr slog.Attr) slog.Attr
// Time format (Default: time.StampMilli)
TimeFormat string
// Disable color (Default: false)
NoColor bool
}
func (o *Options) setDefaults() {
if o.Level == nil {
o.Level = defaultLevel
}
if o.TimeFormat == "" {
o.TimeFormat = defaultTimeFormat
}
}
// NewHandler creates a [slog.Handler] that writes tinted logs to Writer w,
// using the default options. If opts is nil, the default options are used.
func NewHandler(w io.Writer, opts *Options) slog.Handler {
if opts == nil {
opts = &Options{}
}
opts.setDefaults()
return &handler{
mu: &sync.Mutex{},
w: w,
opts: *opts,
}
}
// handler implements a [slog.Handler].
type handler struct {
attrsPrefix string
groupPrefix string
groups []string
mu *sync.Mutex
w io.Writer
opts Options
}
func (h *handler) clone() *handler {
return &handler{
attrsPrefix: h.attrsPrefix,
groupPrefix: h.groupPrefix,
groups: h.groups,
mu: h.mu, // mutex shared among all clones of this handler
w: h.w,
opts: h.opts,
}
}
func (h *handler) Enabled(_ context.Context, level slog.Level) bool {
return level >= h.opts.Level.Level()
}
func (h *handler) Handle(_ context.Context, r slog.Record) error {
// get a buffer from the sync pool
buf := newBuffer()
defer buf.Free()
rep := h.opts.ReplaceAttr
// write time
if !r.Time.IsZero() {
val := r.Time.Round(0) // strip monotonic to match Attr behavior
if rep == nil {
h.appendTintTime(buf, r.Time, -1)
buf.WriteByte(' ')
} else if a := rep(nil /* groups */, slog.Time(slog.TimeKey, val)); a.Key != "" {
val, color := h.resolve(a.Value)
if val.Kind() == slog.KindTime {
h.appendTintTime(buf, val.Time(), color)
} else {
h.appendTintValue(buf, val, false, color, true)
}
buf.WriteByte(' ')
}
}
// write level
if rep == nil {
h.appendTintLevel(buf, r.Level, -1)
buf.WriteByte(' ')
} else if a := rep(nil /* groups */, slog.Any(slog.LevelKey, r.Level)); a.Key != "" {
val, color := h.resolve(a.Value)
if val.Kind() == slog.KindAny {
if lvlVal, ok := val.Any().(slog.Level); ok {
h.appendTintLevel(buf, lvlVal, color)
} else {
h.appendTintValue(buf, val, false, color, false)
}
} else {
h.appendTintValue(buf, val, false, color, false)
}
buf.WriteByte(' ')
}
// write source
if h.opts.AddSource {
fs := runtime.CallersFrames([]uintptr{r.PC})
f, _ := fs.Next()
if f.File != "" {
src := &slog.Source{
Function: f.Function,
File: f.File,
Line: f.Line,
}
if rep == nil {
if h.opts.NoColor {
appendSource(buf, src)
} else {
buf.WriteString(ansiFaint)
appendSource(buf, src)
buf.WriteString(ansiReset)
}
buf.WriteByte(' ')
} else if a := rep(nil /* groups */, slog.Any(slog.SourceKey, src)); a.Key != "" {
val, color := h.resolve(a.Value)
h.appendTintValue(buf, val, false, color, true)
buf.WriteByte(' ')
}
}
}
// write message
if rep == nil {
buf.WriteString(r.Message)
buf.WriteByte(' ')
} else if a := rep(nil /* groups */, slog.String(slog.MessageKey, r.Message)); a.Key != "" {
val, color := h.resolve(a.Value)
h.appendTintValue(buf, val, false, color, false)
buf.WriteByte(' ')
}
// write handler attributes
if len(h.attrsPrefix) > 0 {
buf.WriteString(h.attrsPrefix)
}
// write attributes
r.Attrs(func(attr slog.Attr) bool {
h.appendAttr(buf, attr, h.groupPrefix, h.groups)
return true
})
if len(*buf) == 0 {
buf.WriteByte('\n')
} else {
(*buf)[len(*buf)-1] = '\n' // replace last space with newline
}
h.mu.Lock()
defer h.mu.Unlock()
_, err := h.w.Write(*buf)
return err
}
func (h *handler) WithAttrs(attrs []slog.Attr) slog.Handler {
if len(attrs) == 0 {
return h
}
h2 := h.clone()
buf := newBuffer()
defer buf.Free()
// write attributes to buffer
for _, attr := range attrs {
h.appendAttr(buf, attr, h.groupPrefix, h.groups)
}
h2.attrsPrefix = h.attrsPrefix + string(*buf)
return h2
}
func (h *handler) WithGroup(name string) slog.Handler {
if name == "" {
return h
}
h2 := h.clone()
h2.groupPrefix += name + "."
h2.groups = append(h2.groups, name)
return h2
}
func (h *handler) appendTintTime(buf *buffer, t time.Time, color int16) {
if h.opts.NoColor {
*buf = t.AppendFormat(*buf, h.opts.TimeFormat)
} else {
if color >= 0 {
appendAnsi(buf, uint8(color), true)
} else {
buf.WriteString(ansiFaint)
}
*buf = t.AppendFormat(*buf, h.opts.TimeFormat)
buf.WriteString(ansiReset)
}
}
func (h *handler) appendTintLevel(buf *buffer, level slog.Level, color int16) {
str := func(base string, val slog.Level) []byte {
if val == 0 {
return []byte(base)
} else if val > 0 {
return strconv.AppendInt(append([]byte(base), '+'), int64(val), 10)
}
return strconv.AppendInt([]byte(base), int64(val), 10)
}
if !h.opts.NoColor {
if color >= 0 {
appendAnsi(buf, uint8(color), false)
} else {
switch {
case level < slog.LevelInfo:
case level < slog.LevelWarn:
buf.WriteString(ansiBrightGreen)
case level < slog.LevelError:
buf.WriteString(ansiBrightYellow)
default:
buf.WriteString(ansiBrightRed)
}
}
}
switch {
case level < slog.LevelInfo:
buf.Write(str("DBG", level-slog.LevelDebug))
case level < slog.LevelWarn:
buf.Write(str("INF", level-slog.LevelInfo))
case level < slog.LevelError:
buf.Write(str("WRN", level-slog.LevelWarn))
default:
buf.Write(str("ERR", level-slog.LevelError))
}
if !h.opts.NoColor && level >= slog.LevelInfo {
buf.WriteString(ansiReset)
}
}
func appendSource(buf *buffer, src *slog.Source) {
dir, file := filepath.Split(src.File)
buf.WriteString(filepath.Join(filepath.Base(dir), file))
buf.WriteByte(':')
*buf = strconv.AppendInt(*buf, int64(src.Line), 10)
}
func (h *handler) resolve(val slog.Value) (resolvedVal slog.Value, color int16) {
if !h.opts.NoColor && val.Kind() == slog.KindLogValuer {
if tintVal, ok := val.Any().(tintValue); ok {
return tintVal.Value.Resolve(), int16(tintVal.Color)
}
}
return val.Resolve(), -1
}
func (h *handler) appendAttr(buf *buffer, attr slog.Attr, groupsPrefix string, groups []string) {
var color int16 // -1 if no color
attr.Value, color = h.resolve(attr.Value)
if rep := h.opts.ReplaceAttr; rep != nil && attr.Value.Kind() != slog.KindGroup {
attr = rep(groups, attr)
var colorRep int16
attr.Value, colorRep = h.resolve(attr.Value)
if colorRep >= 0 {
color = colorRep
}
}
if attr.Equal(slog.Attr{}) {
return
}
if attr.Value.Kind() == slog.KindGroup {
if attr.Key != "" {
groupsPrefix += attr.Key + "."
groups = append(groups, attr.Key)
}
for _, groupAttr := range attr.Value.Group() {
h.appendAttr(buf, groupAttr, groupsPrefix, groups)
}
return
}
if h.opts.NoColor {
h.appendKey(buf, attr.Key, groupsPrefix)
h.appendValue(buf, attr.Value, true)
} else {
if color >= 0 {
appendAnsi(buf, uint8(color), true)
h.appendKey(buf, attr.Key, groupsPrefix)
buf.WriteString(ansiResetFaint)
h.appendValue(buf, attr.Value, true)
buf.WriteString(ansiReset)
} else {
buf.WriteString(ansiFaint)
h.appendKey(buf, attr.Key, groupsPrefix)
buf.WriteString(ansiReset)
h.appendValue(buf, attr.Value, true)
}
}
buf.WriteByte(' ')
}
func (h *handler) appendKey(buf *buffer, key, groups string) {
appendString(buf, groups+key, true, !h.opts.NoColor)
buf.WriteByte('=')
}
func (h *handler) appendValue(buf *buffer, v slog.Value, quote bool) {
switch v.Kind() {
case slog.KindString:
appendString(buf, v.String(), quote, !h.opts.NoColor)
case slog.KindInt64:
*buf = strconv.AppendInt(*buf, v.Int64(), 10)
case slog.KindUint64:
*buf = strconv.AppendUint(*buf, v.Uint64(), 10)
case slog.KindFloat64:
*buf = strconv.AppendFloat(*buf, v.Float64(), 'g', -1, 64)
case slog.KindBool:
*buf = strconv.AppendBool(*buf, v.Bool())
case slog.KindDuration:
appendString(buf, v.Duration().String(), quote, !h.opts.NoColor)
case slog.KindTime:
*buf = appendRFC3339Millis(*buf, v.Time())
case slog.KindAny:
defer func() {
// Copied from log/slog/handler.go.
if r := recover(); r != nil {
// If it panics with a nil pointer, the most likely cases are
// an encoding.TextMarshaler or error fails to guard against nil,
// in which case "<nil>" seems to be the feasible choice.
//
// Adapted from the code in fmt/print.go.
if v := reflect.ValueOf(v.Any()); v.Kind() == reflect.Pointer && v.IsNil() {
buf.WriteString("<nil>")
return
}
// Otherwise just print the original panic message.
appendString(buf, fmt.Sprintf("!PANIC: %v", r), true, !h.opts.NoColor)
}
}()
switch cv := v.Any().(type) {
case encoding.TextMarshaler:
data, err := cv.MarshalText()
if err != nil {
break
}
appendString(buf, string(data), quote, !h.opts.NoColor)
case *slog.Source:
appendSource(buf, cv)
default:
appendString(buf, fmt.Sprintf("%+v", cv), quote, !h.opts.NoColor)
}
}
}
func (h *handler) appendTintValue(buf *buffer, val slog.Value, quote bool, color int16, faint bool) {
if h.opts.NoColor {
h.appendValue(buf, val, quote)
} else {
if color >= 0 {
appendAnsi(buf, uint8(color), faint)
} else if faint {
buf.WriteString(ansiFaint)
}
h.appendValue(buf, val, quote)
if color >= 0 || faint {
buf.WriteString(ansiReset)
}
}
}
// Copied from log/slog/handler.go.
func appendRFC3339Millis(b []byte, t time.Time) []byte {
// Format according to time.RFC3339Nano since it is highly optimized,
// but truncate it to use millisecond resolution.
// Unfortunately, that format trims trailing 0s, so add 1/10 millisecond
// to guarantee that there are exactly 4 digits after the period.
const prefixLen = len("2006-01-02T15:04:05.000")
n := len(b)
t = t.Truncate(time.Millisecond).Add(time.Millisecond / 10)
b = t.AppendFormat(b, time.RFC3339Nano)
b = append(b[:n+prefixLen], b[n+prefixLen+1:]...) // drop the 4th digit
return b
}
func appendAnsi(buf *buffer, color uint8, faint bool) {
buf.WriteString("\u001b[")
if faint {
buf.WriteString("2;")
}
if color < 8 {
*buf = strconv.AppendUint(*buf, uint64(color)+30, 10)
} else if color < 16 {
*buf = strconv.AppendUint(*buf, uint64(color)+82, 10)
} else {
buf.WriteString("38;5;")
*buf = strconv.AppendUint(*buf, uint64(color), 10)
}
buf.WriteByte('m')
}
func appendString(buf *buffer, s string, quote, color bool) {
if quote && !color {
// trim ANSI escape sequences
var inEscape bool
s = cut(s, func(r rune) bool {
if r == ansiEsc {
inEscape = true
} else if inEscape && unicode.IsLetter(r) {
inEscape = false
return true
}
return inEscape
})
}
quote = quote && needsQuoting(s)
switch {
case color && quote:
s = strconv.Quote(s)
s = strings.ReplaceAll(s, `\x1b`, string(ansiEsc))
buf.WriteString(s)
case !color && quote:
*buf = strconv.AppendQuote(*buf, s)
default:
buf.WriteString(s)
}
}
func cut(s string, f func(r rune) bool) string {
var res []rune
for i := 0; i < len(s); {
r, size := utf8.DecodeRuneInString(s[i:])
if r == utf8.RuneError {
break
}
if !f(r) {
res = append(res, r)
}
i += size
}
return string(res)
}
// Copied from log/slog/text_handler.go.
func needsQuoting(s string) bool {
if len(s) == 0 {
return true
}
for i := 0; i < len(s); {
b := s[i]
if b < utf8.RuneSelf {
// Quote anything except a backslash that would need quoting in a
// JSON string, as well as space and '='
if b != '\\' && (b == ' ' || b == '=' || !safeSet[b]) {
return true
}
i++
continue
}
r, size := utf8.DecodeRuneInString(s[i:])
if r == utf8.RuneError || unicode.IsSpace(r) || !unicode.IsPrint(r) {
return true
}
i += size
}
return false
}
// Copied from log/slog/json_handler.go.
//
// safeSet is extended by the ANSI escape code "\u001b".
var safeSet = [utf8.RuneSelf]bool{
' ': true,
'!': true,
'"': false,
'#': true,
'$': true,
'%': true,
'&': true,
'\'': true,
'(': true,
')': true,
'*': true,
'+': true,
',': true,
'-': true,
'.': true,
'/': true,
'0': true,
'1': true,
'2': true,
'3': true,
'4': true,
'5': true,
'6': true,
'7': true,
'8': true,
'9': true,
':': true,
';': true,
'<': true,
'=': true,
'>': true,
'?': true,
'@': true,
'A': true,
'B': true,
'C': true,
'D': true,
'E': true,
'F': true,
'G': true,
'H': true,
'I': true,
'J': true,
'K': true,
'L': true,
'M': true,
'N': true,
'O': true,
'P': true,
'Q': true,
'R': true,
'S': true,
'T': true,
'U': true,
'V': true,
'W': true,
'X': true,
'Y': true,
'Z': true,
'[': true,
'\\': false,
']': true,
'^': true,
'_': true,
'`': true,
'a': true,
'b': true,
'c': true,
'd': true,
'e': true,
'f': true,
'g': true,
'h': true,
'i': true,
'j': true,
'k': true,
'l': true,
'm': true,
'n': true,
'o': true,
'p': true,
'q': true,
'r': true,
's': true,
't': true,
'u': true,
'v': true,
'w': true,
'x': true,
'y': true,
'z': true,
'{': true,
'|': true,
'}': true,
'~': true,
'\u007f': true,
'\u001b': true,
}
type tintValue struct {
slog.Value
Color uint8
}
// LogValue implements the [slog.LogValuer] interface.
func (v tintValue) LogValue() slog.Value {
return v.Value
}
// Err returns a tinted (colorized) [slog.Attr] that will be written in red color
// by the [tint.Handler]. When used with any other [slog.Handler], it behaves as
//
// slog.Any("err", err)
func Err(err error) slog.Attr {
return Attr(9, slog.Any(errKey, err))
}
// Attr returns a tinted (colorized) [slog.Attr] that will be written in the
// specified color by the [tint.Handler]. When used with any other [slog.Handler], it behaves as a
// plain [slog.Attr].
//
// Use the uint8 color value to specify the color of the attribute:
//
// - 0-7: standard ANSI colors
// - 8-15: high intensity ANSI colors
// - 16-231: 216 colors (6×6×6 cube)
// - 232-255: grayscale from dark to light in 24 steps
//
// See https://en.wikipedia.org/wiki/ANSI_escape_code#8-bit
func Attr(color uint8, attr slog.Attr) slog.Attr {
attr.Value = slog.AnyValue(tintValue{attr.Value, color})
return attr
}
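Putting `Err` and `Attr` together, a hypothetical call site (the handler construction follows the README above; the message strings and color choice are illustrative):

```go
package main

import (
	"errors"
	"log/slog"
	"os"

	"github.com/lmittmann/tint"
)

func main() {
	logger := slog.New(tint.NewHandler(os.Stderr, nil))

	// Err writes the error under the "err" key in red (ANSI color 9).
	logger.Error("request failed", tint.Err(errors.New("connection reset")))

	// Attr colors any attribute; 10 is a high-intensity green in the 8-bit palette.
	logger.Info("request served", tint.Attr(10, slog.String("status", "20")))
}
```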

1
vendor/github.com/matoous/go-nanoid/v2/.gitignore generated vendored Normal file

@@ -0,0 +1 @@
.idea

21
vendor/github.com/matoous/go-nanoid/v2/LICENSE generated vendored Normal file

@@ -0,0 +1,21 @@
The MIT License (MIT)
Copyright (c) 2018 Matous Dzivjak <matousdzivjak@gmail.com>
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

21
vendor/github.com/matoous/go-nanoid/v2/Makefile generated vendored Normal file

@@ -0,0 +1,21 @@
# Make this makefile self-documented with target `help`
.PHONY: help
.DEFAULT_GOAL := help
help: ## Show help
@grep -Eh '^[0-9a-zA-Z_-]+:.*?## .*$$' $(MAKEFILE_LIST) | sort | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-30s\033[0m %s\n", $$1, $$2}'
.PHONY: lint
lint: download ## Lint the repository with golang-ci lint
golangci-lint run --max-same-issues 0 --max-issues-per-linter 0 $(if $(CI),--out-format code-climate > gl-code-quality-report.json 2>golangci-stderr-output)
.PHONY: test
test: download ## Run all tests
go test -v
.PHONY: bench
bench: download ## Run all benchmarks
go test -bench=.
.PHONY: download
download: ## Download dependencies
go mod download

55
vendor/github.com/matoous/go-nanoid/v2/README.md generated vendored Normal file

@@ -0,0 +1,55 @@
# Go Nanoid
[![CI](https://github.com/matoous/go-nanoid/workflows/CI/badge.svg)](https://github.com/matoous/go-nanoid/actions)
[![GoDoc](https://godoc.org/github.com/matoous/go-nanoid?status.svg)](https://godoc.org/github.com/matoous/go-nanoid)
[![Go Report Card](https://goreportcard.com/badge/github.com/matoous/go-nanoid)](https://goreportcard.com/report/github.com/matoous/go-nanoid)
[![GitHub issues](https://img.shields.io/github/issues/matoous/go-nanoid.svg)](https://github.com/matoous/go-nanoid/issues)
[![License](https://img.shields.io/badge/license-MIT%20License-blue.svg)](https://github.com/matoous/go-nanoid/LICENSE)
This package is a Go implementation of [ai's](https://github.com/ai) [nanoid](https://github.com/ai/nanoid)!
**Safe.** It uses a cryptographically strong random generator.
**Compact.** It uses more symbols than UUID (`A-Za-z0-9_-`)
and has the same number of unique options in just 22 symbols instead of 36.
**Fast.** Nanoid is as fast as UUID but can be used in URLs.
There's also this alternative: https://github.com/jaevor/go-nanoid.
## Install
Via the `go get` tool
``` bash
$ go get github.com/matoous/go-nanoid/v2
```
## Usage
Generate ID
``` go
id, err := gonanoid.New()
```
Generate ID with a custom alphabet and length
``` go
id, err := gonanoid.Generate("abcde", 54)
```
## Notice
If you use Go Nanoid in your project, please let me know!
If you have any issues, just feel free and open it in this repository, thanks!
## Credits
- [ai](https://github.com/ai) - [nanoid](https://github.com/ai/nanoid)
- icza - his tutorial on [random strings in Go](https://stackoverflow.com/questions/22892120/how-to-generate-a-random-string-of-a-fixed-length-in-golang)
## License
The MIT License (MIT). Please see [License File](LICENSE.md) for more information.

108
vendor/github.com/matoous/go-nanoid/v2/gonanoid.go generated vendored Normal file

@@ -0,0 +1,108 @@
package gonanoid
import (
"crypto/rand"
"errors"
"math"
)
// defaultAlphabet is the alphabet used for ID characters by default.
var defaultAlphabet = []rune("_-0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")
const (
defaultSize = 21
)
// getMask generates bit mask used to obtain bits from the random bytes that are used to get index of random character
// from the alphabet. Example: if the alphabet has 6 = (110)_2 characters it is sufficient to use mask 7 = (111)_2
func getMask(alphabetSize int) int {
for i := 1; i <= 8; i++ {
mask := (2 << uint(i)) - 1
if mask >= alphabetSize-1 {
return mask
}
}
return 0
}
// Generate is a low-level function to change alphabet and ID size.
func Generate(alphabet string, size int) (string, error) {
chars := []rune(alphabet)
if len(alphabet) == 0 || len(alphabet) > 255 {
return "", errors.New("alphabet must not be empty and contain no more than 255 chars")
}
if size <= 0 {
return "", errors.New("size must be positive integer")
}
mask := getMask(len(chars))
// estimate how many random bytes we will need for the ID, we might actually need more but this is tradeoff
// between average case and worst case
ceilArg := 1.6 * float64(mask*size) / float64(len(alphabet))
step := int(math.Ceil(ceilArg))
id := make([]rune, size)
bytes := make([]byte, step)
for j := 0; ; {
_, err := rand.Read(bytes)
if err != nil {
return "", err
}
for i := 0; i < step; i++ {
currByte := bytes[i] & byte(mask)
if currByte < byte(len(chars)) {
id[j] = chars[currByte]
j++
if j == size {
return string(id[:size]), nil
}
}
}
}
}
// MustGenerate is the same as Generate but panics on error.
func MustGenerate(alphabet string, size int) string {
id, err := Generate(alphabet, size)
if err != nil {
panic(err)
}
return id
}
// New generates secure URL-friendly unique ID.
// Accepts optional parameter - length of the ID to be generated (21 by default).
func New(l ...int) (string, error) {
var size int
switch {
case len(l) == 0:
size = defaultSize
case len(l) == 1:
size = l[0]
if size < 0 {
return "", errors.New("negative id length")
}
default:
return "", errors.New("unexpected parameter")
}
bytes := make([]byte, size)
_, err := rand.Read(bytes)
if err != nil {
return "", err
}
id := make([]rune, size)
for i := 0; i < size; i++ {
id[i] = defaultAlphabet[bytes[i]&63]
}
return string(id[:size]), nil
}
// Must is the same as New but panics on error.
func Must(l ...int) string {
id, err := New(l...)
if err != nil {
panic(err)
}
return id
}
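As a usage sketch (gemserve tags connections with IDs like these in its logs; the exact call site here is an assumption), the package can be used as follows:

```go
package main

import (
	"fmt"
	"log"

	gonanoid "github.com/matoous/go-nanoid/v2"
)

func main() {
	// New uses the default 64-character URL-safe alphabet and a length of 21.
	connID, err := gonanoid.New()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("connection id:", connID)

	// Generate accepts a custom alphabet and length, at the cost of the
	// rejection-sampling loop shown above.
	shortID, err := gonanoid.Generate("0123456789abcdef", 8)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("short id:", shortID)
}
```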

19
vendor/modules.txt vendored Normal file

@@ -0,0 +1,19 @@
# git.antanst.com/antanst/uid v0.0.1 => ../uid
## explicit; go 1.24.3
git.antanst.com/antanst/uid
# github.com/gabriel-vasile/mimetype v1.4.10
## explicit; go 1.21
github.com/gabriel-vasile/mimetype
github.com/gabriel-vasile/mimetype/internal/charset
github.com/gabriel-vasile/mimetype/internal/csv
github.com/gabriel-vasile/mimetype/internal/json
github.com/gabriel-vasile/mimetype/internal/magic
github.com/gabriel-vasile/mimetype/internal/markup
github.com/gabriel-vasile/mimetype/internal/scan
# github.com/lmittmann/tint v1.1.2
## explicit; go 1.21
github.com/lmittmann/tint
# github.com/matoous/go-nanoid/v2 v2.1.0
## explicit; go 1.20
github.com/matoous/go-nanoid/v2
# git.antanst.com/antanst/uid => ../uid