Siegfried

Siegfried is a signature-based file format identification tool.

Key features are:

  • complete implementation of PRONOM (byte and container signatures)
  • fast matching without limiting the number of bytes scanned
  • detailed information about the basis for format matches
  • simple command line interface with a choice of outputs
  • a built-in server for integrating with workflows and language inter-op
  • power options including debug mode, signature modification, and multiple identifiers

Version

1.4.5

Usage

Command line

sf file.ext
sf DIR

Options

sf -csv file.ext | DIR                     // Output CSV rather than YAML
sf -json file.ext | DIR                    // Output JSON rather than YAML
sf -droid file.ext | DIR                   // Output DROID CSV rather than YAML
sf -                                       // Read list of files piped to stdin
sf -nr DIR                                 // Don't scan subdirectories
sf -z file.zip | DIR                       // Decompress and scan zip, tar, gzip, warc, arc
sf -hash md5 file.ext | DIR                // Calculate md5, sha1, sha256, sha512, or crc hash
sf -sig custom.sig file.ext                // Use a custom signature file
sf -home c:\junk -sig custom.sig file.ext  // Use a custom home directory
sf -serve hostname:port                    // Server mode
sf -version                                // Display version information
sf -throttle 10ms DIR                      // Pause for duration (e.g. 1s) between file scans
sf -log [comma-sep opts] file.ext | DIR    // Log errors etc. to stderr (default) or stdout
sf -log e,w file.ext | DIR                 // Log errors and warnings to stderr
sf -log u,o file.ext | DIR                 // Log unknowns to stdout
sf -log d,s file.ext | DIR                 // Log debugging and slow messages to stderr
sf -log p,t DIR > results.yaml             // Log progress and time while redirecting results
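For language inter-op without the server, one option is to call sf from another program and parse its JSON output. The following is a minimal sketch in Go, not part of siegfried itself: it assumes sf is on your PATH and uses only the -json flag listed above. The helper names (decodeResults, topLevelKeys, identify) are hypothetical, and because this README does not describe the JSON schema, the output is decoded into a generic map rather than into named fields.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"os/exec"
	"sort"
)

// decodeResults parses sf's JSON output into a generic map. No field names
// are assumed, since the README does not document the schema.
func decodeResults(data []byte) (map[string]interface{}, error) {
	var out map[string]interface{}
	if err := json.Unmarshal(data, &out); err != nil {
		return nil, fmt.Errorf("decoding sf output: %w", err)
	}
	return out, nil
}

// topLevelKeys returns the top-level keys of a decoded result, sorted.
func topLevelKeys(m map[string]interface{}) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// identify runs `sf -json <path>` and decodes whatever it writes to stdout.
func identify(path string) (map[string]interface{}, error) {
	cmd := exec.Command("sf", "-json", path)
	cmd.Stderr = os.Stderr // pass any sf log messages through to our stderr
	data, err := cmd.Output()
	if err != nil {
		return nil, fmt.Errorf("running sf: %w", err)
	}
	return decodeResults(data)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: sfjson file.ext | DIR")
		return
	}
	// Assumption: the sf binary is installed and on PATH.
	if _, err := exec.LookPath("sf"); err != nil {
		fmt.Fprintln(os.Stderr, "sf not found on PATH")
		return
	}
	res, err := identify(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Print the top-level keys so you can inspect the structure sf returns.
	fmt.Println(topLevelKeys(res))
}
```

Once you have inspected the keys your sf version emits, you could replace the generic map with a struct that matches them.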

Signatures

By default, siegfried uses the latest PRONOM and container signatures with no buffer limits. You can customise your signature file by using the roy tool.

Install

With Go installed:

go get github.com/richardlehane/siegfried/cmd/sf

sf -update

Or, without Go installed:

Windows:

Download a pre-built binary from the releases page. Unzip to a location in your system path. Then run:

sf -update

Mac Homebrew (or Linuxbrew):

brew install mistydemeo/digipres/siegfried

Ubuntu/Debian (64 bit):

wget -qO - https://bintray.com/user/downloadSubjectPublicKey?username=bintray | sudo apt-key add -
echo "deb http://dl.bintray.com/siegfried/debian wheezy main" | sudo tee -a /etc/apt/sources.list
sudo apt-get update && sudo apt-get install siegfried

Recent Changes

Version 1.4.5 (6/2/2016)

Version 1.4.4 (9/1/2016)

  • fix: speed regression introduced by the TIFF mis-identification patch in the last release
  • code quality: refactor textmatcher package
  • code quality: refactor siegreader package
  • code quality: documentation

Version 1.4.3 (19/12/2015)

Version 1.4.2 (27/11/2015)

Version 1.4.1 (6/11/2015)

  • -log replaces -debug, -slow, -unknown and -known flags (see usage above)
  • highlight empty file/stream with error and warning
  • negative text match overrides extension-only plain text match

Version 1.4.0 (31/10/2015)

  • new MIME matcher; requested by Dragan Espenschied
  • support warc continuations
  • add all.json and tiff.json sets
  • minor speed-up
  • report less redundant basis information
  • report error on empty file/stream

Full change history

Rights

Copyright 2016 Richard Lehane

Licensed under the Apache License, Version 2.0

Contributing

Like siegfried and want to get involved in its development? That'd be wonderful! There are some notes on the wiki to get you started, and please get in touch.

Thanks

Thanks TNA for http://www.nationalarchives.gov.uk/pronom/ and http://www.nationalarchives.gov.uk/information-management/projects-and-work/droid.htm

Thanks Ross for https://github.com/exponential-decay/skeleton-test-suite-generator and http://exponentialdecay.co.uk/sd/index.htm; both are very handy!

Thanks Misty for the brew and Ubuntu packaging
