GitHub - jimmyfrasche/htmlrep: basic reports about legacy html

#htmlrep

Command htmlrep prints a report of tags and links used in a UTF-8 encoded HTML blob read from stdin.

Download:

go get github.com/jimmyfrasche/htmlrep

If you do not have the go command on your system, you need to install Go first.


The document is never parsed, only tokenized, so many documents may be concatenated together.

These reports are useful when preparing to migrate legacy data into a new web site.

There are three reports: tags and attributes, links in attributes, and links in text nodes.

The tags and attributes report lists each tag used in the blob on its own line, followed by every attribute used on any instance of that tag, with each attribute on its own line and indented by one tab. The reports are separated by blank lines.
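The layout of the tags and attributes report can be sketched as follows. The `renderTags` function and its input type are hypothetical, and the alphabetical ordering is an assumption made so the output is deterministic; the README does not say how htmlrep orders its lines.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// renderTags formats a tag -> attribute-set mapping with one tag per
// line and each of its attributes on a following line, indented by one
// tab. Sorting is an assumption of this sketch, not documented behavior.
func renderTags(tags map[string]map[string]bool) string {
	names := make([]string, 0, len(tags))
	for name := range tags {
		names = append(names, name)
	}
	sort.Strings(names)

	var b strings.Builder
	for _, name := range names {
		b.WriteString(name + "\n")
		attrs := make([]string, 0, len(tags[name]))
		for attr := range tags[name] {
			attrs = append(attrs, attr)
		}
		sort.Strings(attrs)
		for _, attr := range attrs {
			b.WriteString("\t" + attr + "\n")
		}
	}
	return b.String()
}

func main() {
	tags := map[string]map[string]bool{
		"a":  {"href": true, "class": true},
		"br": {}, // a tag that never carried an attribute
		"p":  {"class": true},
	}
	fmt.Print(renderTags(tags))
}
```

For this input the sketch prints `a`, then `class` and `href` indented by a tab, then `br` alone, then `p` followed by an indented `class`.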

Both link reports list all unique links in the blob, one per line. The links in attributes report lists links found in all attributes known to contain links. The links in text nodes report scans text nodes for things that may be links, using a number of heuristics to cull false positives. These heuristics are Unicode-aware but largely English-centric.

By default all reports are shown, but some may be hidden using the following flags:

-t	only show tags and attributes report
-l	only show the links reports
-c	of the links reports, only show links from text nodes
-a	of the links reports, only show links from attributes

##EXAMPLES

Show all reports:

cat *.html | htmlrep

Show only the tags and attributes report:

cat *.html | htmlrep -t

Show only the links reports:

cat *.html | htmlrep -l

Show only probable links from text nodes:

cat *.html | htmlrep -l -c

Show only links from attributes:

cat *.html | htmlrep -l -a

Automatically generated by autoreadme on 2016.07.03

