drbig/golang

Experiments in Golang

Another assorted collection of fun in Golang.

If you have any feedback, drop me a line at p.staszewski@gmail.com or look for dRbiG on FreeNode.

Proper stuff

Grabber is a concurrent declarative web scraper and downloader.

This is the rewrite of grabber.go.

Camtagger is a simple command-line mass-tagger for Camlistore.

If you're not interested in Camlistore (but you should be) this will be of no use to you.

AdHole is a simple transparent advertisement and tracking blocker intended for personal use. Simple, minimal, low-level.

Working stuff

getlinks.go

Extract links (or actually anything) from websites based on regexps. One major feature is support for going through 'all pages' by following a 'Next' or similar link (again, by regexp matching). You can have more than one extracting regexp per URL, and they are processed in parallel. I guess you can do similar stuff with httrack or maybe even curl/wget, but I personally prefer to be able to extract all links so I can run wget on the right machine at the right time. The code is not commented, but it should be reasonably easy to understand anyway.

arpapp.go

A 'webapp' that discovers IPs via the ARP cache and then ping-checks them, to bring you a succinct report of what is online and what is offline in your neighbourhood. It concerns itself only with the last state change (keeping both the 'since' and 'time elapsed'). It can clean old entries based on a regexp match. The latest addition is the ability to read standard zone files, so that a hostname can be displayed alongside the IP address. Tested on FreeBSD and Linux, works like a charm.

dhproxy.go

Thanks to goproxy this is basically a 'one-liner' - it's a simple HTTP proxy that will dump the URI of each request that goes through it to stdout.

wol.go

Simple implementation of Wake-on-LAN. This is part of another 'project' of mine where I implemented this functionality in a number of languages (currently C, Go, OCaml, Ruby and Scheme). Before running you should edit the broadcast IP and the static MAC table. Each argument is first looked up in the table; if that fails it is treated as a MAC address. In both cases, if the parsing goes OK, a magic packet is sent three times (without delay). There is only rudimentary error checking. Tested only on Linux (where it works).
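For reference, a Wake-on-LAN magic packet is six 0xFF bytes followed by the target MAC address repeated sixteen times. The sketch below shows the lookup-then-parse flow and the packet layout described above; it is not the code from wol.go. The function names, the table entry and the broadcast address in the usage note are assumptions for illustration.

```go
package main

import (
	"fmt"
	"net"
)

// magicPacket builds the Wake-on-LAN payload: six 0xFF bytes followed by
// sixteen repetitions of the 6-byte MAC address (102 bytes in total).
func magicPacket(mac net.HardwareAddr) ([]byte, error) {
	if len(mac) != 6 {
		return nil, fmt.Errorf("expected a 6-byte MAC, got %d bytes", len(mac))
	}
	pkt := make([]byte, 0, 6+16*6)
	for i := 0; i < 6; i++ {
		pkt = append(pkt, 0xFF)
	}
	for i := 0; i < 16; i++ {
		pkt = append(pkt, mac...)
	}
	return pkt, nil
}

// resolve looks the argument up in a static name table first and falls
// back to parsing the argument itself as a MAC address.
func resolve(table map[string]string, arg string) (net.HardwareAddr, error) {
	if known, ok := table[arg]; ok {
		arg = known
	}
	return net.ParseMAC(arg)
}

// wake sends the magic packet three times, without delay, to the given
// UDP broadcast address (host:port).
func wake(broadcast string, mac net.HardwareAddr) error {
	pkt, err := magicPacket(mac)
	if err != nil {
		return err
	}
	conn, err := net.Dial("udp", broadcast)
	if err != nil {
		return err
	}
	defer conn.Close()
	for i := 0; i < 3; i++ {
		if _, err := conn.Write(pkt); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	table := map[string]string{"nas": "00:11:22:33:44:55"} // hypothetical table entry
	mac, err := resolve(table, "nas")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	pkt, err := magicPacket(mac)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("packet length:", len(pkt))
}
```

To actually send the packet you would call something like `wake("192.168.1.255:9", mac)`, with the broadcast address adjusted to your own network; `main` above deliberately stops short of touching the network.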

pomf.go

Rather simple uploader for pomf.se. A simple example of using mime/multipart to post a file and encoding/json to parse the results. Without arguments it will try to upload whatever comes in on stdin. Otherwise you can specify filenames as arguments, and it will try to upload each file. Exits successfully only if all uploads were successful, and dies very quickly on almost any error.

grabber.go

Updates:

No longer maintained here. Please find the rewritten version here.

Added basic support for header injection. To use it, add headers to your target definition (the usual hash). Note that this also gives you the ability to do basic cookie injection. Warning: if you have multiple targets defined in a single file, be aware that headers are currently read from a global variable, so there might be problems 'between the targets' (i.e. an edge case where a link has been pushed to the queue from target 1 but gets processed when target 2 is set as the source of headers). The obvious work-around is to use one target per file.

Added a new mode, single. Useful when results are paginated in chronological order - as you don't know the last page's link, you can use single to go to the 'last page' and then follow every 'previous page' link.

Added basic downloading statistics (number of files, total size and average download speed).

General:

Purely declarative web scraper with parallel downloading. Intended to automate the common pattern found in websites: archive page pagination - post links - resource links. You should open the unixporn.json example to see the structure of a target definition. A non-URL data extraction example is in golang.json.

There are two modes: follow - use for pagination, and every - use for link extraction. Then there are four actions that can be taken: print - will print the link to stdout, log - will log the link (either stdout or stderr, depending on the state of the -log flag), download - will send the link to the built-in parallel downloader (which will download only if the file doesn't already exist), and finally raw - this action will print to stdout whatever was matched, without trying to resolve it into a full URL (this way you can use this software to extract pretty much anything).

The do actions can be chained in an arbitrary manner and form a command path that will be executed linearly. You can define multiple targets in a single file. It will work with SSL connections; however, it will not do any verification.

This is by far the most involved piece of software that I have written in Golang, and although it's far from beautiful, I had a lot of fun writing it.

Depends on gokogiri. Tested under Linux and FreeBSD.

kurier.go

Check the status of your deliveries from the command-line (and fast). This one uses either an XPath or a plain old regexp extractor, so you can deal easily with both HTML and JSON. It also has support for a custom download function for those idiotic services that think PHPSESSID is a security feature. Provide your tracking numbers as arguments. Depends on gokogiri and sanitize.

captcher.go

Prototype of a stand-alone image-based CAPTCHA generator and server. The two main goals are performance and simplicity, with the intent to experiment with different forms of CAPTCHAs (as the mutilated-text images suck). The API consists of /gen, which returns either a 500 error or a simple JSON response (token as int and answer as string), and /image?id=TOKEN. Right now it will generate random string images without any 'effects'. It has some command-line options, such as how long the images should remain downloadable. Being a prototype, it lacks proper performance testing and some additional security (like protection of /gen with a passkey or IP policy). The only build dependency is freetype-go; there are no runtime dependencies. The sample.ttf font comes from the Ruby/SDL repository (if I remember correctly) and I claim no copyright for it.

httpknock.go

Open a firewall by issuing a GET request. Rough but tested. Important config is only in the code. Not that you should trust binaries for such stuff anyway. Needs SSL before I'd actually recommend it.

License

Copyright (c) 2012 - 2017 Piotr S. Staszewski.

Usual 2-clause BSD license for everything. See LICENSE.txt for details.
