Scraping With Golang And Proxy Authentication

Why Use Golang for Web Scraping?

Golang offers some of the fastest frameworks for scraping web content.

Go offers a wide selection of frameworks. Some are simple packages with core functionality, while others, such as Ferret, Gocrawl, Soup, and Hakrawler, offer an advanced web scraping architecture to simplify data extraction.

The most popular framework for writing web scrapers in Go is Colly.

With Colly you can easily extract structured data from websites, which can be used for a wide range of applications, like data mining, data processing, web site application testing or archiving.

Colly Features:

  • Clean API
  • Fast (>1k request/sec on a single core)
  • Manages request delays and maximum concurrency per domain
  • Automatic cookie and session handling
  • Sync/async/parallel scraping
  • Distributed scraping
  • Caching
  • Automatic encoding of non-unicode responses
  • Robots.txt support
  • Google App Engine support

Colly has a clean API, handles cookies and sessions automatically, supports caching and robots.txt, and most importantly, it’s fast. Colly offers distributed scraping, HTTP request delays, and concurrency per domain.

Golang code is cross-platform and runs remarkably fast. In one example, a scraping task written with Colly ran in less than 12 seconds, while the same task in Scrapy, one of the most optimized modern frameworks for Python, took about 20 seconds. If speed is your priority for web scraping, consider Golang together with a modern framework such as Colly.

Setting Up Proxy Authentication for Golang Colly

HTTP configuration

Colly uses Golang's default HTTP client as its networking layer. You can tweak HTTP options by replacing the default HTTP RoundTripper.

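// Requires the "net", "net/http", and "time" packages plus Colly itself.
c := colly.NewCollector()
c.WithTransport(&http.Transport{
	Proxy: http.ProxyFromEnvironment, // reads HTTP_PROXY, HTTPS_PROXY, and NO_PROXY
	DialContext: (&net.Dialer{
		Timeout:   30 * time.Second,
		KeepAlive: 30 * time.Second,
		DualStack: true, // deprecated in recent Go versions; dual-stack dialing is already the default
	}).DialContext,
	MaxIdleConns:          100,
	IdleConnTimeout:       90 * time.Second,
	TLSHandshakeTimeout:   10 * time.Second,
	ExpectContinueTimeout: 1 * time.Second,
})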
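
If you would rather point the transport at one specific proxy than rely on environment variables, a minimal standard-library sketch could look like the following. The helper names newProxyTransport and proxyHostFor are hypothetical, and the address and credentials are placeholders.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"time"
)

// newProxyTransport builds a transport that routes every request through
// rawProxyURL. Credentials in the URL stay attached as url.Userinfo.
func newProxyTransport(rawProxyURL string) (*http.Transport, error) {
	proxyURL, err := url.Parse(rawProxyURL)
	if err != nil {
		return nil, fmt.Errorf("parse proxy URL: %w", err)
	}
	return &http.Transport{
		Proxy:               http.ProxyURL(proxyURL), // fixed proxy instead of an environment lookup
		MaxIdleConns:        100,
		IdleConnTimeout:     90 * time.Second,
		TLSHandshakeTimeout: 10 * time.Second,
	}, nil
}

// proxyHostFor reports the proxy host the transport selects for targetURL.
func proxyHostFor(tr *http.Transport, targetURL string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, targetURL, nil)
	if err != nil {
		return "", err
	}
	proxyURL, err := tr.Proxy(req)
	if err != nil {
		return "", err
	}
	if proxyURL == nil {
		return "", nil // no proxy configured for this request
	}
	return proxyURL.Host, nil
}

func main() {
	// Placeholder credentials and address (assumptions for illustration).
	tr, err := newProxyTransport("socks5://username:password@127.0.0.1:9000")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	host, err := proxyHostFor(tr, "https://example.com/")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("requests go through proxy at", host)
}
```

The returned transport can then be handed to the collector with c.WithTransport(tr).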

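Proxy authentication code example, with the username and password embedded in the proxy URL:

// collyProxy is Colly's proxy package; check err, then register the switcher with c.SetProxyFunc(roundRobinSwitcher).
roundRobinSwitcher, err := collyProxy.RoundRobinProxySwitcher("socks5://username:password@127.0.0.1:9000")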
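
To make the idea of a round-robin switcher concrete, here is a minimal sketch that uses only the standard library. It is an illustration of the technique, not Colly's implementation. The names roundRobinProxy and pickHosts are hypothetical, and the second proxy address is an assumed placeholder.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"sync/atomic"
)

// roundRobinProxy returns a function suitable for http.Transport.Proxy that
// cycles through the given proxy URLs, advancing by one on every call.
func roundRobinProxy(rawURLs ...string) (func(*http.Request) (*url.URL, error), error) {
	if len(rawURLs) == 0 {
		return nil, fmt.Errorf("at least one proxy URL is required")
	}
	proxies := make([]*url.URL, 0, len(rawURLs))
	for i, raw := range rawURLs {
		u, err := url.Parse(raw)
		if err != nil {
			// Report the index only, so credentials do not end up in logs.
			return nil, fmt.Errorf("proxy URL %d is invalid", i)
		}
		proxies = append(proxies, u)
	}
	var next uint32
	return func(*http.Request) (*url.URL, error) {
		// The atomic counter keeps the rotation safe across goroutines.
		i := atomic.AddUint32(&next, 1) - 1
		return proxies[i%uint32(len(proxies))], nil
	}, nil
}

// pickHosts calls the selector n times and collects the chosen proxy hosts.
func pickHosts(pick func(*http.Request) (*url.URL, error), n int) ([]string, error) {
	req, err := http.NewRequest(http.MethodGet, "https://example.com/", nil)
	if err != nil {
		return nil, err
	}
	hosts := make([]string, 0, n)
	for i := 0; i < n; i++ {
		u, err := pick(req)
		if err != nil {
			return nil, err
		}
		hosts = append(hosts, u.Host)
	}
	return hosts, nil
}

func main() {
	pick, err := roundRobinProxy(
		"socks5://username:password@127.0.0.1:9000",
		"socks5://username:password@127.0.0.1:9001", // assumed second proxy
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	hosts, err := pickHosts(pick, 3)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(hosts)
}
```

Because each proxy URL carries its own credentials, every proxy in the rotation can use a different username and password.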

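You do not need to send the credentials yourself. Go's net/http transport reads the username and password from the proxy URL and uses them to authenticate with the SOCKS5 proxy during the handshake:
(https://github.com/golang/go/blob/master/src/net/http/transport.go#L1624)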

case cm.proxyURL.Scheme == "socks5":
	conn := pconn.conn
	d := socksNewDialer("tcp", conn.RemoteAddr().String())
	if u := cm.proxyURL.User; u != nil {
		auth := &socksUsernamePassword{
			Username: u.Username(),
		}
		auth.Password, _ = u.Password()
		d.AuthMethods = []socksAuthMethod{
			socksAuthMethodNotRequired,
			socksAuthMethodUsernamePassword,
		}
		d.Authenticate = auth.Authenticate
	}
	if _, err := d.DialWithConn(ctx, conn, "tcp", cm.targetAddr); err != nil {
		conn.Close()
		return nil, err
	}
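
The excerpt above reads the credentials through url.Userinfo. The sketch below shows the same accessors in isolation, and also how a Basic Proxy-Authorization value of the kind used with HTTP proxies can be derived from them. The function names proxyCredentials and basicProxyAuth are hypothetical and are not part of net/http.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net/url"
)

// proxyCredentials extracts the username and password from a proxy URL
// using the url.Userinfo accessors Username and Password.
func proxyCredentials(rawProxyURL string) (user, pass string, ok bool) {
	u, err := url.Parse(rawProxyURL)
	if err != nil || u.User == nil {
		return "", "", false
	}
	pass, _ = u.User.Password()
	return u.User.Username(), pass, true
}

// basicProxyAuth builds a Basic credential string for a
// Proxy-Authorization header.
func basicProxyAuth(user, pass string) string {
	return "Basic " + base64.StdEncoding.EncodeToString([]byte(user+":"+pass))
}

func main() {
	// Placeholder credentials and address (assumptions for illustration).
	user, pass, ok := proxyCredentials("socks5://username:password@127.0.0.1:9000")
	if !ok {
		fmt.Println("proxy URL carries no credentials")
		return
	}
	fmt.Println("user:", user)
	fmt.Println("header value:", basicProxyAuth(user, pass))
}
```

A proxy URL without a userinfo part makes proxyCredentials return ok == false, which mirrors the nil check on cm.proxyURL.User in the excerpt.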

Use this method when you want users to supply the proxy server address, port number, username, and password to your application, and your application stores those credentials somewhere, either in the clear or using necessarily reversible encryption.
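
One way to keep credentials out of the source code is to assemble the proxy URL at runtime. The sketch below assumes the values arrive through three environment variables (PROXY_ADDR, PROXY_USER, and PROXY_PASS are hypothetical names) and uses url.UserPassword so that characters such as "@" or ":" in the password are escaped correctly. The helpers buildProxyURL and passwordOf are also hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
)

// buildProxyURL assembles a proxy URL from its parts. url.UserPassword
// takes care of escaping special characters in the credentials.
func buildProxyURL(scheme, addr, user, pass string) *url.URL {
	u := &url.URL{Scheme: scheme, Host: addr}
	if user != "" {
		u.User = url.UserPassword(user, pass)
	}
	return u
}

// passwordOf parses a proxy URL and returns the decoded password, if any.
func passwordOf(rawProxyURL string) string {
	u, err := url.Parse(rawProxyURL)
	if err != nil || u.User == nil {
		return ""
	}
	p, _ := u.User.Password()
	return p
}

func main() {
	// Hypothetical variable names; fall back to the article's example address.
	addr := os.Getenv("PROXY_ADDR")
	if addr == "" {
		addr = "127.0.0.1:9000"
	}
	proxyURL := buildProxyURL("socks5", addr, os.Getenv("PROXY_USER"), os.Getenv("PROXY_PASS"))

	// Redacted hides the password, which makes the URL safe to log.
	fmt.Println("using proxy:", proxyURL.Redacted())
}
```

The result of proxyURL.String() can be passed to RoundRobinProxySwitcher in place of a hard-coded URL.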
