If you’re trying to set up an Apache web server on CentOS as your origin server for Google PageSpeed Service, you need to take certain steps to make sure that Apache can see (and log) the IP address of the visiting host. Otherwise, all you’ll see in your Apache log files is a bunch of Google IP addresses, like this:
74.125.187.192 - - [22/Jun/2014:16:16:24 -0700] "GET /apple-touch-icon-precomposed.png HTTP/1.1" 200 5280
74.125.187.192 - - [22/Jun/2014:16:16:24 -0700] "GET /android/ HTTP/1.1" 200 3314
74.125.187.192 - - [22/Jun/2014:16:16:24 -0700] "GET /digimon-world-ds-nintendo-ds/ HTTP/1.1" 302 20
74.125.187.192 - - [22/Jun/2014:16:16:25 -0700] "GET /digimon-world-ds-nintendo-ds/ HTTP/1.1" 302 20
74.125.187.192 - - [22/Jun/2014:16:16:26 -0700] "GET /xbox-360/ HTTP/1.1" 200 3328
74.125.187.192 - - [22/Jun/2014:16:16:26 -0700] "GET /pokemon-heartgold-version-nintendo-ds/ HTTP/1.1" 200 20756
74.125.187.176 - - [22/Jun/2014:16:16:26 -0700] "GET /spongebob-squarepants-planktons-robotic-revenge-xbox-360/ HTTP/1.1" 302 20
74.125.187.153 - - [22/Jun/2014:16:16:26 -0700] "GET /spongebob-squarepants-planktons-robotic-revenge-xbox-360/ HTTP/1.1" 302 20
74.125.19.13 - - [22/Jun/2014:16:16:29 -0700] "GET /saints-row-2-xbox-360/ HTTP/1.1" 200 10175
74.125.187.192 - - [22/Jun/2014:16:16:30 -0700] "GET /saints-row-2-xbox-360/ HTTP/1.1" 200 10175
74.125.187.192 - - [22/Jun/2014:16:16:31 -0700] "GET /nintendo-ds/d/ HTTP/1.1" 302 20
74.125.187.192 - - [22/Jun/2014:16:16:31 -0700] "GET /android/ HTTP/1.1" 200 3314
74.125.187.192 - - [22/Jun/2014:16:16:31 -0700] "GET /android/ HTTP/1.1" 200 3314
74.125.187.192 - - [22/Jun/2014:16:16:33 -0700] "GET /disneys-meet-the-robinsons-ps2/ HTTP/1.1" 200 4175
74.125.187.192 - - [22/Jun/2014:16:16:34 -0700] "GET /android/d/ HTTP/1.1" 200 3179
74.125.187.192 - - [22/Jun/2014:16:16:35 -0700] "GET / HTTP/1.1" 200 3600
74.125.187.192 - - [22/Jun/2014:16:16:36 -0700] "GET /yu-gi-oh-gx-spirit-caller-nintendo-ds/ HTTP/1.1" 302 20
Google addresses this in their PageSpeed Service FAQ here: https://developers.google.com/speed/pagespeed/service/faq#clientip.
Introducing mod_remoteip
In order to allow your Apache server to log the correct incoming IPs, you need to install mod_remoteip on your server. As the official description of the module puts it, it “replaces the original client IP address for the connection with the useragent IP address list presented by a proxies or a load balancer via the request headers.” Basically, it tells Apache what the actual remote IP address is. The module ships with Apache starting with version 2.4, so if you’re running Apache 2.4 or later, you don’t need to compile and install it; just skip to “Configuring mod_remoteip for Google PageSpeed” below. To find out which version of Apache you’re running, do:
apachectl -v
You should get something like:
Server version: Apache/2.2.15 (Unix)
Server built: Apr 3 2014 23:56:16
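If you’d rather not eyeball the version string, here’s a minimal sketch of the same check as a shell function. The function name needs_remoteip_backport is my own invention for illustration, and it assumes the output contains a string of the form Apache/MAJOR.MINOR as in the sample above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Apache): given the output of `apachectl -v`,
# return 0 if the server is older than 2.4 and needs the backported module,
# 1 if mod_remoteip already ships with it, 2 if no version could be parsed.
needs_remoteip_backport() {
    # Pull "major.minor" out of e.g. "Server version: Apache/2.2.15 (Unix)"
    version=$(printf '%s\n' "$1" \
        | sed -n 's|.*Apache/\([0-9][0-9]*\.[0-9][0-9]*\).*|\1|p' \
        | head -n 1)
    [ -n "$version" ] || return 2
    major=${version%%.*}
    minor=${version#*.}
    if [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 4 ]; }; then
        return 1
    fi
    return 0
}

# Example (run on the server itself):
#   if needs_remoteip_backport "$(apachectl -v)"; then
#       echo "Apache is older than 2.4: compile the backport"
#   fi
```

With the sample output shown above (Apache/2.2.15), the function returns 0, meaning the next section applies to you.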
Compiling and Installing mod_remoteip on Apache 2.2
If you are running Apache 2.2, you can thank Takashi Takizawa for backporting mod_remoteip for Apache 2.2.X servers and posting it on his GitHub as mod_remoteip-httpd22.
First, make a new subdirectory in /usr/local/src to download the module, then jump to that directory:
mkdir /usr/local/src/mod_remoteip
cd /usr/local/src/mod_remoteip
Next, download the code, example config file, and Makefile with:
wget https://raw.githubusercontent.com/ttkzw/mod_remoteip-httpd22/master/mod_remoteip.c
wget https://raw.githubusercontent.com/ttkzw/mod_remoteip-httpd22/master/mod_remoteip.conf
wget https://raw.githubusercontent.com/ttkzw/mod_remoteip-httpd22/master/Makefile
Takashi’s install instructions tell you to use apxs to install the module, but you probably don’t have that on your server (it comes with the httpd-devel package). So make sure it’s installed by doing:
yum install httpd-devel
Now you’re ready to either run the apxs command line shown on Takashi’s GitHub, or just use the Makefile (which is what I suggest) by typing:
make
And then once that’s complete, type:
make install
The newly compiled module is now installed, and ready to be configured for Apache to use.
Configuring mod_remoteip for Google PageSpeed
Now that mod_remoteip is installed, tell your Apache server how to use it by editing your httpd.conf file. Find a good location in the file to paste in the following five lines:
# Load and configure mod_remoteip for Google PageSpeed Service
LoadModule remoteip_module /usr/lib64/httpd/modules/mod_remoteip.so
RemoteIPHeader X-Forwarded-For
# Google PageSpeed Service, ref http://support.google.com/a/bin/answer.py?hl=en&answer=60764
RemoteIPInternalProxy 0.0.0.0/0 1.1.1.1/1 2.2.2.2/2
Note that the IP address ranges in that last line won’t actually work as written. It’s the example Google provides in their docs, but what’s actually supposed to be there is a long list of specific IP address ranges for Google’s servers. However, because they are subject to change, Google wants you to query for them to get a recent list. From a command line, they want you to do:
nslookup -q=TXT _spf.google.com 8.8.8.8
If you did this on June 24, 2014 (the day I wrote this post), you’d get the following in reply:
Non-authoritative answer:
_spf.google.com text = "v=spf1 include:_netblocks.google.com include:_netblocks2.google.com include:_netblocks3.google.com ~all"
That’s saying that all the possible IP address ranges are represented by those “included” netblocks, so you now need to query each of them (one at a time) for the specific IP address ranges. Using the first one, _netblocks.google.com, as an example (the leading underscore is important), you’d type:
nslookup -q=TXT _netblocks.google.com 8.8.8.8
and in response you’d get:
Non-authoritative answer:
_netblocks.google.com text = "v=spf1 ip4:216.239.32.0/19 ip4:64.233.160.0/19 ip4:66.249.80.0/20 ip4:72.14.192.0/18 ip4:209.85.128.0/17 ip4:66.102.0.0/20 ip4:74.125.0.0/16 ip4:64.18.0.0/20 ip4:207.126.144.0/20 ip4:173.194.0.0/16 ~all"
Now Google wants you to include all those IP address ranges in that RemoteIPInternalProxy line in your httpd.conf, then query the next netblock, collect all those ranges, query the next one, get all those ranges, and so on.
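If you do go the manual route, picking the ranges out of each reply by hand is the error-prone part. Here’s a rough sketch of two small filters that do it for you. The function names (spf_tokens, extract_spf_includes, extract_spf_ranges) are hypothetical, and the sketch assumes the reply keeps each SPF record on one line with space-separated include:, ip4: and ip6: entries, as in the output above.

```shell
#!/bin/sh
# Split SPF record text on spaces, tabs and double quotes: one token per line.
spf_tokens() {
    tr -s ' \t"' '\n\n\n'
}

# Print the netblock names referenced by include: entries, one per line.
extract_spf_includes() {
    spf_tokens | sed -n 's/^include://p'
}

# Print the address ranges from ip4: and ip6: entries, one per line.
extract_spf_ranges() {
    spf_tokens | sed -n 's/^ip[46]://p'
}

# Example (needs network access, so it is left commented out):
#   nslookup -q=TXT _spf.google.com 8.8.8.8 | extract_spf_includes
#   nslookup -q=TXT _netblocks.google.com 8.8.8.8 | extract_spf_ranges
```

Both filters read from standard input, so you can also paste a saved reply into them instead of querying live.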
But there’s a better way.
There’s a great little script, gwhitelist, written by Mike Miller, which is designed to help Postfix admins create whitelists for Gmail servers, based on Google’s netblocks. It uses DNS lookup commands similar to the ones Google suggests (it actually uses dig instead of nslookup) to query for the netblocks list, then queries each of those for the IP ranges. Finally, it spits out the ranges in a Postfix-friendly format.
On June 24, 2014, I got the following when I ran gwhitelist.sh:
216.239.32.0/19 permit
64.233.160.0/19 permit
66.249.80.0/20 permit
72.14.192.0/18 permit
209.85.128.0/17 permit
66.102.0.0/20 permit
74.125.0.0/16 permit
64.18.0.0/20 permit
207.126.144.0/20 permit
173.194.0.0/16 permit
2001:4860:4000::/36 permit
2404:6800:4000::/36 permit
2607:f8b0:4000::/36 permit
2800:3f0:4000::/36 permit
2a00:1450:4000::/36 permit
2c0f:fb50:4000::/36 permit
You could just copy and paste all those values (ignoring the permit), and you’re good to go. But I’m lazy, so I commented out one line of Mike’s script, and added a new line that formats the output in a single line, making it even easier to just copy and paste it into your httpd.conf file. I created a gist of the file and called it gwhitelist_pss.sh (and gave Mike credit in the slightly modified script). Feel free to download the “raw” version of that script, run it, and paste in the values.
Regardless of how you gather the IP ranges, collect them all (including the IPv6 ones at the end) and include them on a single line in your httpd.conf file to replace the 0.0.0.0/0 1.1.1.1/1 2.2.2.2/2 values in the example with the full list of current Google IP ranges.
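As an illustration of that last step, here’s a minimal sketch that turns a list of ranges (one per line, with or without the trailing permit) into a finished directive line. This is not the modified gist mentioned above, just one way to do the same thing with awk; the function name format_remoteip_line is made up for this example.

```shell
#!/bin/sh
# Read ranges on stdin, one per line (anything after the first field, such as
# Postfix's "permit", is ignored), and print one RemoteIPInternalProxy line.
format_remoteip_line() {
    awk '
        $1 != "" { ranges = ranges " " $1 }
        END      { if (ranges != "") print "RemoteIPInternalProxy" ranges }
    '
}

# Example, assuming gwhitelist.sh is in the current directory:
#   ./gwhitelist.sh | format_remoteip_line
```

Paste the resulting line into httpd.conf in place of the example RemoteIPInternalProxy line.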
Once you’re done editing your httpd.conf file, restart Apache with:
apachectl restart
Check your logfile with:
tail -f /var/log/httpd/access_log
and you should start seeing non-Google IP addresses making requests on your server, like this:
204.236.213.209 - - [22/Jun/2014:16:17:04 -0700] "GET /nintendo-ds/y/ HTTP/1.1" 302 20
98.17.169.104 - - [22/Jun/2014:16:17:05 -0700] "GET /ps2/ HTTP/1.1" 200 3330
54.224.54.148 - - [22/Jun/2014:16:17:05 -0700] "GET /g-force-ps2/ HTTP/1.1" 302 20
173.58.23.114 - - [22/Jun/2014:16:17:05 -0700] "GET /grand-theft-auto-iii-android/ HTTP/1.1" 200 4540
173.209.212.204 - - [22/Jun/2014:16:17:06 -0700] "GET /xbox-360/r/ HTTP/1.1" 200 4614
50.97.84.118 - - [22/Jun/2014:16:17:06 -0700] "GET /the-chronicles-of-riddick-escape-from-butcher-bay-xbox/ HTTP/1.1" 302 20
66.249.67.199 - - [22/Jun/2014:16:17:07 -0700] "GET /bass-pro-shops-the-hunt-wii/ HTTP/1.1" 200 3426
208.123.79.68 - - [22/Jun/2014:16:17:07 -0700] "GET /duel-masters-kaijudo-showdown-gba/ HTTP/1.1" 302 20
66.249.67.225 - - [22/Jun/2014:16:17:08 -0700] "GET /the-chronicles-of-narnia-prince-caspian-ps2/ HTTP/1.1" 200 3713
66.249.67.225 - - [22/Jun/2014:16:17:08 -0700] "GET /squinkies-nintendo-ds/ HTTP/1.1" 200 3298
70.199.247.79 - - [22/Jun/2014:16:17:08 -0700] "GET /ps3/ HTTP/1.1" 200 3329
66.249.67.199 - - [22/Jun/2014:16:17:11 -0700] "GET /bionicle-maze-of-shadows-gba/ HTTP/1.1" 200 3501
70.11.61.199 - - [22/Jun/2014:16:17:12 -0700] "GET / HTTP/1.1" 200 3600
66.249.67.199 - - [22/Jun/2014:16:17:12 -0700] "GET /roblox-pc/ HTTP/1.1" 200 4913
107.201.244.173 - - [22/Jun/2014:16:17:12 -0700] "GET / HTTP/1.1" 200 3600
66.249.67.199 - - [22/Jun/2014:16:17:13 -0700] "GET /tony-hawks-pro-skater-dreamcast/ HTTP/1.1" 200 5153
66.249.67.199 - - [22/Jun/2014:16:17:13 -0700] "GET /guitar-hero-5-ps2/ HTTP/1.1" 200 4581
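Watching the log scroll by works, but a quick tally makes the difference easier to see. Here’s a small sketch that counts requests per client address, assuming the client IP is the first field of each log line as in the samples above. The function name top_client_ips is my own.

```shell
#!/bin/sh
# Print the busiest client addresses in an access log as "count address",
# most frequent first.
#   $1: path to the access log
#   $2: how many addresses to show (defaults to 10)
top_client_ips() {
    awk '{ print $1 }' "$1" | sort | uniq -c | sort -rn | head -n "${2:-10}"
}

# Example:
#   top_client_ips /var/log/httpd/access_log 5
```

If the top of the list is still dominated by a handful of addresses from the ranges you just configured, the module probably isn’t being applied yet.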
Congratulations! You’re running Google PageSpeed Service and still showing the proper remote IP address.
As always, I welcome your questions and comments below.