hariharan-uno/extract

extract

extract is a simple library for extracting elements from a web page. It provides simple, higher-level functions built on the Cascadia and html packages.

For example,

Extract all the links from a web page:

package main

import (
	"fmt"
	"log"

	"github.com/hariharan-uno/extract"
)

func main() {
	l, err := extract.Links("http://google.com")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(l)
}

Extract all the URLs of the images from a web page:

package main

import (
	"fmt"
	"log"

	"github.com/hariharan-uno/extract"
)

func main() {
	i, err := extract.Images("http://google.com")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(i)
}
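
Pages often contain relative links, so it can be handy to turn the extracted values into absolute URLs before using them. The following is a minimal sketch using only the standard library's net/url package. It assumes the extracted links are available as a slice of strings; that type is an assumption and is not confirmed by this README. The helper resolveAll is hypothetical and not part of extract, and the sample links stand in for a real result.

```go
package main

import (
	"fmt"
	"log"
	"net/url"
)

// resolveAll is a hypothetical helper (not part of the extract package).
// It resolves every link against base and returns the absolute URLs.
func resolveAll(base string, links []string) ([]string, error) {
	b, err := url.Parse(base)
	if err != nil {
		return nil, fmt.Errorf("parse base %q: %w", base, err)
	}
	out := make([]string, 0, len(links))
	for _, l := range links {
		// (*url.URL).Parse resolves l relative to b; absolute links pass through unchanged.
		u, err := b.Parse(l)
		if err != nil {
			return nil, fmt.Errorf("resolve %q: %w", l, err)
		}
		out = append(out, u.String())
	}
	return out, nil
}

func main() {
	// Stand-in for links extracted from a page; the []string type is an assumption.
	links := []string{"about.html", "/images/logo.png", "http://example.org/x"}

	abs, err := resolveAll("http://example.com/docs/", links)
	if err != nil {
		log.Fatal(err)
	}
	for _, a := range abs {
		fmt.Println(a)
	}
}
```

The same helper would work for image URLs, since resolution depends only on the base URL of the page that was fetched.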

Currently, only the functions extract.Links() and extract.Images() are supported. If you'd like a specific function to be supported, please file an issue.

Credits

Authors of Cascadia and html
