How to gather IP and User Agent info from web log with AWK?

I have a log file containing lines like:

66.249.74.18 - - [21/Apr/2013:05:55:33 +0000] 200 "GET /1.jpg HTTP/1.1" 7691 "-" "Googlebot-Image/1.0" "-"
220.181.108.96 - - [21/Apr/2013:05:55:33 +0000] 200 "GET /1.html HTTP/1.1" 17722 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" "-"

I want to collect every IP and user agent into a file:

66.249.74.18 "Googlebot-Image/1.0"
220.181.108.96 "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"

How can I do it with awk?

I know awk '{print $1}' lists all the IPs and awk -F\" '{print $6}' lists all the user agents, but I have no idea how to combine them into one output.

awk -F' - |\"' '{print $1, $7}' temp1

output:

66.249.74.18 Googlebot-Image/1.0
220.181.108.96 Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)
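
The output above loses the double quotes that the question asks for, because the quote character is consumed as a field separator. A minimal sketch that prints them back, assuming the same field layout as the sample lines (ip_and_agent is a hypothetical helper name, and the sample variable only stands in for the real log):

```shell
# Two sample lines copied from the question.
sample='66.249.74.18 - - [21/Apr/2013:05:55:33 +0000] 200 "GET /1.jpg HTTP/1.1" 7691 "-" "Googlebot-Image/1.0" "-"
220.181.108.96 - - [21/Apr/2013:05:55:33 +0000] 200 "GET /1.html HTTP/1.1" 17722 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" "-"'

# ip_and_agent is a hypothetical helper, not part of the original answer.
# Fields are split on " - " or a double quote, so $1 is the IP and $7 is
# the user agent; the quotes eaten by the separator are printed back.
ip_and_agent() {
    awk -F' - |"' '{ print $1, "\"" $7 "\"" }' "$@"
}

printf '%s\n' "$sample" | ip_and_agent
```

The helper also accepts file names, so something like ip_and_agent temp1 > result.txt should collect the pairs into a file. Note that the field numbers depend on the second column being "-", as it is in the sample.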

temp1 file:

66.249.74.18 - - [21/Apr/2013:05:55:33 +0000] 200 "GET /1.jpg HTTP/1.1" 7691 "-" "Googlebot-Image/1.0" "-"
220.181.108.96 - - [21/Apr/2013:05:55:33 +0000] 200 "GET /1.html HTTP/1.1" 17722 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"     "-"
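
Another option, which keeps the quotes around the user agent. It needs GNU awk 4.0 or later, since FPAT is a gawk extension; pass the log file as the last argument or pipe it in:

awk '{print $1,$6}' FPAT='(^| )[0-9.]+|"[^"]*"'
  • FPAT describes what a field looks like instead of what separates fields; here a field is either
    • the beginning of the line or a space, followed by [0-9.]+
    • or a double-quoted string, "[^"]*"
  • then print fields 1 and 6 (the IP and the quoted user agent)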
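
On awks without FPAT, such as mawk or BSD awk, roughly the same result can be had by combining the two commands from the question. This is only a sketch under the assumption that the user agent is always the third quoted string on the line; ip_and_agent_portable is a hypothetical helper name:

```shell
# Two sample lines copied from the question.
sample='66.249.74.18 - - [21/Apr/2013:05:55:33 +0000] 200 "GET /1.jpg HTTP/1.1" 7691 "-" "Googlebot-Image/1.0" "-"
220.181.108.96 - - [21/Apr/2013:05:55:33 +0000] 200 "GET /1.html HTTP/1.1" 17722 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" "-"'

# ip_and_agent_portable is a hypothetical helper, not from the original answers.
# With a double quote as the only separator, $6 is the user agent and $1 is
# everything before the request string; split() pulls the IP out of $1.
ip_and_agent_portable() {
    awk -F'"' '{
        split($1, head, " ")          # head[1] is the client IP
        print head[1], "\"" $6 "\""   # re-add the quotes around the user agent
    }' "$@"
}

printf '%s\n' "$sample" | ip_and_agent_portable
```

Unlike the FPAT version, this does not depend on a gawk extension, but it will pick the wrong field if a log line has a different number of quoted strings.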