Configuring Apache on CentOS 6 for Google PageSpeed Service


If you’re setting up an Apache web server on CentOS as the origin server for Google PageSpeed Service, you need to take a few extra steps so that Apache can see (and log) the IP address of the visiting host. Otherwise, all you’ll see in your Apache log files is a bunch of Google IP addresses, like this:

74.125.187.192 - - [22/Jun/2014:16:16:24 -0700] "GET /apple-touch-icon-precomposed.png HTTP/1.1" 200 5280
74.125.187.192 - - [22/Jun/2014:16:16:24 -0700] "GET /android/ HTTP/1.1" 200 3314
74.125.187.192 - - [22/Jun/2014:16:16:24 -0700] "GET /digimon-world-ds-nintendo-ds/ HTTP/1.1" 302 20
74.125.187.192 - - [22/Jun/2014:16:16:25 -0700] "GET /digimon-world-ds-nintendo-ds/ HTTP/1.1" 302 20
74.125.187.192 - - [22/Jun/2014:16:16:26 -0700] "GET /xbox-360/ HTTP/1.1" 200 3328
74.125.187.192 - - [22/Jun/2014:16:16:26 -0700] "GET /pokemon-heartgold-version-nintendo-ds/ HTTP/1.1" 200 20756
74.125.187.176 - - [22/Jun/2014:16:16:26 -0700] "GET /spongebob-squarepants-planktons-robotic-revenge-xbox-360/ HTTP/1.1" 302 20
74.125.187.153 - - [22/Jun/2014:16:16:26 -0700] "GET /spongebob-squarepants-planktons-robotic-revenge-xbox-360/ HTTP/1.1" 302 20
74.125.19.13 - - [22/Jun/2014:16:16:29 -0700] "GET /saints-row-2-xbox-360/ HTTP/1.1" 200 10175
74.125.187.192 - - [22/Jun/2014:16:16:30 -0700] "GET /saints-row-2-xbox-360/ HTTP/1.1" 200 10175
74.125.187.192 - - [22/Jun/2014:16:16:31 -0700] "GET /nintendo-ds/d/ HTTP/1.1" 302 20
74.125.187.192 - - [22/Jun/2014:16:16:31 -0700] "GET /android/ HTTP/1.1" 200 3314
74.125.187.192 - - [22/Jun/2014:16:16:31 -0700] "GET /android/ HTTP/1.1" 200 3314
74.125.187.192 - - [22/Jun/2014:16:16:33 -0700] "GET /disneys-meet-the-robinsons-ps2/ HTTP/1.1" 200 4175
74.125.187.192 - - [22/Jun/2014:16:16:34 -0700] "GET /android/d/ HTTP/1.1" 200 3179
74.125.187.192 - - [22/Jun/2014:16:16:35 -0700] "GET / HTTP/1.1" 200 3600
74.125.187.192 - - [22/Jun/2014:16:16:36 -0700] "GET /yu-gi-oh-gx-spirit-caller-nintendo-ds/ HTTP/1.1" 302 20

Google addresses this in their PageSpeed Service FAQ here: https://developers.google.com/speed/pagespeed/service/faq#clientip.

Introducing mod_remoteip

To let your Apache server log the correct incoming IPs, you need to install mod_remoteip on your server. As the official description of the module puts it, it “replaces the original client IP address for the connection with the useragent IP address list presented by a proxies or a load balancer via the request headers.” Basically, it tells Apache what the actual remote IP address is. This module ships with Apache starting with version 2.4, so if you’re running Apache 2.4 or later, you don’t need to compile and install it; just skip to the “Configuring mod_remoteip for Google PageSpeed” section below. To find out which version of Apache you’re running, do:

apachectl -v

You should get something like:

Server version: Apache/2.2.15 (Unix)
Server built: Apr 3 2014 23:56:16
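
If you’d rather script that check, here is a minimal sketch. It only parses a version line like the one above; the function names apache_major_minor and needs_backport are my own for illustration, and the sample string stands in for real apachectl -v output.

```shell
#!/bin/sh
# Extract "major.minor" from a line such as
# "Server version: Apache/2.2.15 (Unix)".
apache_major_minor() {
    printf '%s\n' "$1" |
        sed -n 's|^Server version: Apache/\([0-9][0-9]*\.[0-9][0-9]*\).*|\1|p'
}

# Succeeds only for Apache 2.2, the one case where the backported
# module has to be compiled by hand.
needs_backport() {
    case "$(apache_major_minor "$1")" in
        2.2) return 0 ;;
        *)   return 1 ;;
    esac
}

# Sample line copied from the output above; on a real server you would
# pass in the first line of `apachectl -v` instead.
sample='Server version: Apache/2.2.15 (Unix)'
if needs_backport "$sample"; then
    echo "Apache $(apache_major_minor "$sample"): compile the mod_remoteip backport"
else
    echo "Not Apache 2.2: check whether mod_remoteip is already bundled"
fi
```

On a live server, something like needs_backport "$(apachectl -v | head -n 1)" should give the same answer, assuming the first line of output has the format shown above.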

Compiling and Installing mod_remoteip on Apache 2.2

If you are running Apache 2.2, you can thank Takashi Takizawa for backporting mod_remoteip to Apache 2.2.x servers and posting it on his GitHub as mod_remoteip-httpd22.

First, make a new subdirectory in /usr/local/src to download the module, then jump to that directory:

mkdir /usr/local/src/mod_remoteip

cd /usr/local/src/mod_remoteip

Next, download the code, example config file, and Makefile with:

wget https://raw.githubusercontent.com/ttkzw/mod_remoteip-httpd22/master/mod_remoteip.c
wget https://raw.githubusercontent.com/ttkzw/mod_remoteip-httpd22/master/mod_remoteip.conf
wget https://raw.githubusercontent.com/ttkzw/mod_remoteip-httpd22/master/Makefile

Takashi’s install instructions tell you to use apxs to install the module, but you probably don’t have that on your server (it comes with the httpd-devel package). So make sure it’s installed by doing:

yum install httpd-devel

Now you’re ready to either run the apxs command line shown on Takashi’s GitHub, or just use the Makefile (which is what I suggest) by typing:

make

And then once that’s complete, type:

make install

The newly compiled module is now installed, and ready to be configured for Apache to use.

Configuring mod_remoteip for Google PageSpeed

Now that mod_remoteip is installed, tell your Apache server how to use it by editing your httpd.conf file. Find a good location in the file to paste in the following five lines:

# Load and configure mod_remoteip for Google PageSpeed Service
LoadModule remoteip_module /usr/lib64/httpd/modules/mod_remoteip.so
RemoteIPHeader X-Forwarded-For
# Google PageSpeed Service, ref http://support.google.com/a/bin/answer.py?hl=en&answer=60764
RemoteIPInternalProxy 0.0.0.0/0 1.1.1.1/1 2.2.2.2/2

Note that the IP address ranges in that last line won’t actually work as written. It’s the example Google provides in their docs, but what’s actually supposed to be there is a long list of specific IP address ranges for Google’s servers. However, because they are subject to change, Google wants you to query for them to get a recent list. From a command line, they want you to do:

nslookup -q=TXT _spf.google.com 8.8.8.8

If you did this on June 24, 2014 (the day I wrote this post), you’d get the following in reply:

Non-authoritative answer:
_spf.google.com text = "v=spf1 include:_netblocks.google.com include:_netblocks2.google.com include:_netblocks3.google.com ~all"

That’s saying that all the possible IP address ranges are represented by those “included” netblocks, so you now need to query each of them (one at a time) for the specific IP address ranges. Using the first one, _netblocks.google.com, as an example (the leading underscore is important), you’d type:

nslookup -q=TXT _netblocks.google.com 8.8.8.8

and in response you’d get:

Non-authoritative answer:
_netblocks.google.com   text = "v=spf1 ip4:216.239.32.0/19 ip4:64.233.160.0/19 ip4:66.249.80.0/20 ip4:72.14.192.0/18 ip4:209.85.128.0/17 ip4:66.102.0.0/20 ip4:74.125.0.0/16 ip4:64.18.0.0/20 ip4:207.126.144.0/20 ip4:173.194.0.0/16 ~all"

Now Google wants you to include all those IP address ranges in that RemoteIPInternalProxy line in your httpd.conf, then query the next netblock, collect all those ranges, query the next one, get all those ranges, and so on.

But there’s a better way.

There’s a great little script, gwhitelist, written by Mike Miller, which is designed to help Postfix admins create whitelists for Gmail servers based on Google’s netblocks. It uses DNS lookups similar to the ones Google suggests (it actually uses dig instead of nslookup) to query for the list of netblocks, then queries each of those for its IP ranges. Finally, it spits out the ranges in a Postfix-friendly format.
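
To make that two-stage lookup concrete, here is a rough sketch of the parsing half. This is not Mike’s script: the helper names spf_tokens, spf_includes and spf_ranges are my own, and the two sample records are copied from the replies shown above (the second one shortened to three ranges) so the example runs without touching DNS.

```shell
#!/bin/sh
# Split an SPF TXT record into one token per line, breaking on
# spaces and double quotes.
spf_tokens() {
    printf '%s\n' "$1" | tr ' "' '\n\n'
}

# Names of the nested netblocks to query next (the "include:" entries).
spf_includes() {
    spf_tokens "$1" | sed -n 's/^include://p'
}

# CIDR ranges listed directly in a record ("ip4:" and "ip6:" entries).
spf_ranges() {
    spf_tokens "$1" | sed -n 's/^ip[46]://p'
}

# Sample replies from June 2014; the second is shortened for brevity.
top='_spf.google.com text = "v=spf1 include:_netblocks.google.com include:_netblocks2.google.com include:_netblocks3.google.com ~all"'
block='_netblocks.google.com text = "v=spf1 ip4:216.239.32.0/19 ip4:64.233.160.0/19 ip4:66.249.80.0/20 ~all"'

spf_includes "$top"    # one netblock name per line
spf_ranges "$block"    # one CIDR range per line
```

For live data you could feed each helper the reply from dig +short TXT for the relevant name instead of the sample strings; treat that wiring as an assumption to test on your own server.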

On June 24, 2014, I got the following when I ran gwhitelist.sh:

216.239.32.0/19   permit
64.233.160.0/19   permit
66.249.80.0/20   permit
72.14.192.0/18   permit
209.85.128.0/17   permit
66.102.0.0/20   permit
74.125.0.0/16   permit
64.18.0.0/20   permit
207.126.144.0/20   permit
173.194.0.0/16   permit
2001:4860:4000::/36   permit
2404:6800:4000::/36   permit
2607:f8b0:4000::/36   permit
2800:3f0:4000::/36   permit
2a00:1450:4000::/36   permit
2c0f:fb50:4000::/36   permit

You could just copy and paste all those values (ignoring the permit), and you’re good to go. But I’m lazy, so I commented out one line of Mike’s script, and added a new line that formats the output in a single line, making it even easier to just copy and paste it into your httpd.conf file. I created a gist of the file and called it gwhitelist_pss.sh (and gave Mike credit in the slightly modified script). Feel free to download the “raw” version of that script, run it, and paste in the values.

Regardless of how you gather the IP ranges, collect them all (including the IPv6 ones at the end) and include them on a single line in your httpd.conf file to replace the 0.0.0.0/0 1.1.1.1/1 2.2.2.2/2 values in the example with the full list of current Google IP ranges.
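
If you already have the ranges in the Postfix-style format shown earlier, a small filter can do the joining for you. This is only a sketch under that assumption (one range per line, optionally followed by permit); build_proxy_line is a name I made up, and it is not the modified gist mentioned above.

```shell
#!/bin/sh
# Read lines like "216.239.32.0/19   permit" on stdin and print a
# single "RemoteIPInternalProxy ..." directive using the first field
# of every non-empty line. Prints nothing if there is no input.
build_proxy_line() {
    awk 'NF { printf "%s%s", (n++ ? " " : "RemoteIPInternalProxy "), $1 }
         END { if (n) print "" }'
}

# Three sample ranges taken from the list above, including an IPv6 one.
printf '%s\n' \
    '216.239.32.0/19   permit' \
    '74.125.0.0/16   permit' \
    '2001:4860:4000::/36   permit' | build_proxy_line
```

Piping the full list through the function should give you one line that you can paste over the placeholder RemoteIPInternalProxy line in httpd.conf.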

Once you’re done editing your httpd.conf file, restart Apache with:

apachectl restart

Check your logfile with:

tail -f /var/log/httpd/access_log

and you should start seeing non-Google IP addresses making requests on your server, like this:

204.236.213.209 - - [22/Jun/2014:16:17:04 -0700] "GET /nintendo-ds/y/ HTTP/1.1" 302 20
98.17.169.104 - - [22/Jun/2014:16:17:05 -0700] "GET /ps2/ HTTP/1.1" 200 3330
54.224.54.148 - - [22/Jun/2014:16:17:05 -0700] "GET /g-force-ps2/ HTTP/1.1" 302 20
173.58.23.114 - - [22/Jun/2014:16:17:05 -0700] "GET /grand-theft-auto-iii-android/ HTTP/1.1" 200 4540
173.209.212.204 - - [22/Jun/2014:16:17:06 -0700] "GET /xbox-360/r/ HTTP/1.1" 200 4614
50.97.84.118 - - [22/Jun/2014:16:17:06 -0700] "GET /the-chronicles-of-riddick-escape-from-butcher-bay-xbox/ HTTP/1.1" 302 20
66.249.67.199 - - [22/Jun/2014:16:17:07 -0700] "GET /bass-pro-shops-the-hunt-wii/ HTTP/1.1" 200 3426
208.123.79.68 - - [22/Jun/2014:16:17:07 -0700] "GET /duel-masters-kaijudo-showdown-gba/ HTTP/1.1" 302 20
66.249.67.225 - - [22/Jun/2014:16:17:08 -0700] "GET /the-chronicles-of-narnia-prince-caspian-ps2/ HTTP/1.1" 200 3713
66.249.67.225 - - [22/Jun/2014:16:17:08 -0700] "GET /squinkies-nintendo-ds/ HTTP/1.1" 200 3298
70.199.247.79 - - [22/Jun/2014:16:17:08 -0700] "GET /ps3/ HTTP/1.1" 200 3329
66.249.67.199 - - [22/Jun/2014:16:17:11 -0700] "GET /bionicle-maze-of-shadows-gba/ HTTP/1.1" 200 3501
70.11.61.199 - - [22/Jun/2014:16:17:12 -0700] "GET / HTTP/1.1" 200 3600
66.249.67.199 - - [22/Jun/2014:16:17:12 -0700] "GET /roblox-pc/ HTTP/1.1" 200 4913
107.201.244.173 - - [22/Jun/2014:16:17:12 -0700] "GET / HTTP/1.1" 200 3600
66.249.67.199 - - [22/Jun/2014:16:17:13 -0700] "GET /tony-hawks-pro-skater-dreamcast/ HTTP/1.1" 200 5153
66.249.67.199 - - [22/Jun/2014:16:17:13 -0700] "GET /guitar-hero-5-ps2/ HTTP/1.1" 200 4581
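
Eyeballing tail output works, but a quick tally per client address makes the change easier to confirm. The sketch below assumes the log format shown above, where the client IP is the first field; count_clients is a hypothetical helper name, and the sample lines are copied from the log excerpt.

```shell
#!/bin/sh
# Count requests per client IP (first field of each log line) and
# list the busiest addresses first.
count_clients() {
    awk '{ hits[$1]++ } END { for (ip in hits) print hits[ip], ip }' | sort -rn
}

# Four sample lines from the excerpt above.
sample_log='70.11.61.199 - - [22/Jun/2014:16:17:12 -0700] "GET / HTTP/1.1" 200 3600
66.249.67.199 - - [22/Jun/2014:16:17:12 -0700] "GET /roblox-pc/ HTTP/1.1" 200 4913
107.201.244.173 - - [22/Jun/2014:16:17:12 -0700] "GET / HTTP/1.1" 200 3600
66.249.67.199 - - [22/Jun/2014:16:17:13 -0700] "GET /tony-hawks-pro-skater-dreamcast/ HTTP/1.1" 200 5153'

printf '%s\n' "$sample_log" | count_clients
```

Against the real log you would run count_clients < /var/log/httpd/access_log; if the list is still dominated by a handful of 74.125.x.x addresses, the module is probably not loaded or the proxy ranges are incomplete.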

Congratulations! You’re running Google PageSpeed Service and still showing the proper remote IP address.

As always, I welcome your questions and comments below.