Implementing a home proxy server with Squid - Linux Magazine


KNOW-HOW: Squid proxy server

SAFE HARBOR

A proxy server provides safer and more efficient surfing. Although commercial proxy solutions are available, all you really need is Linux and an old PC in the attic.

BY GEERT VAN PAMEL

I have had a home network for several years. I started with a router using Windows XP with ICS (Internet Connection Sharing) and one multi-homed Ethernet card. The main disadvantages were instability, low performance, and a total lack of security. Troubleshooting was totally impossible. Firewall configuration was at the mercy of inexperienced users, who clicked randomly at security settings as if they were playing Russian roulette.

I finally turned to Linux and set up an iptables firewall on a Pentium II computer acting as a router. The firewall system would keep the attackers off my network and log incoming and outgoing traffic. Along with the iptables firewall, I also set up a Squid proxy server to improve Internet performance, filter out unwanted popup ads, and block dangerous URLs.

A Squid proxy server filters Web traffic and caches frequently accessed files. A proxy server limits Internet bandwidth usage, speeds up Web access, and lets you filter URLs. Centrally blocking advertisements and dangerous downloads is cost effective and transparent for the end user.

Squid is a high performance implementation of a free Open-Source, full-featured proxy caching server. Squid provides extensive access controls and integrates easily with an iptables firewall. In my case, the Squid proxy server and the iptables firewall worked together to protect my network from intruders and dangerous HTML. You’ll find many useful discussions of firewalls in books, magazines, and Websites. (See [1] and [2], for example.)
The Squid proxy server, on the other hand, is not as well documented, especially for small home networks like mine. In this article, I will show you how to set up Squid.

Getting Started

The first step is to find the necessary hardware. Figure 1 depicts the network configuration of the Pentium II computer I used as a firewall and proxy server. This firewall system should operate with minimal human intervention, so after the system is configured, you’ll want to disconnect the mouse, keyboard, and video screen. You may need to adjust the BIOS settings so that the computer will boot without a keyboard. The goal is to be able to put the whole system in the attic, where you won’t hear it or trip over it. From the minihub shown in Figure 1, you can come “downstairs” to the home network using standard UTP cable or a wireless connection. Table 1 shows recommended hardware for the firewall machine.

Table 1: Recommended Hardware
(Necessary Components: Specifics)
• Intel Pentium II CPU, or higher (why not a spare Alpha Server?): 350 MHz
• 80 - 100 MB memory minimum: more is better
• 1 or more IDE disks (reuse 2 old disks: 1 GB system SW + swap & 3 GB for cache + /home disk): 4 GB minimum
• 2 Ethernet cards, minihub, fast Ethernet modem, wireless router or hub: 100 Mbit/s if possible
• CDROM, DVD reader: software is mostly distributed via DVD
• Use only normal straight LAN cables [no need for cross cables]: modem and minihub cross themselves!

ISSUE 60, NOVEMBER 2005, WWW.LINUX-MAGAZINE.COM

Assuming your firewall is working, the next step is to set up Squid. Squid is available from the Internet at [3] or one of its mirrors [4] as tar.gz (compile from sources). You can easily install it using one of the following commands:

    rpm -i /cdrom/RedHat/RPMS/squid-2.4.STABLE7-4.i386.rpm   # Red Hat 8
    rpm -i /cdrom/Fedora/RPMS/squid-2.5.STABLE6-3.i386.rpm   # Fedora Core 3
    rpm -i /cdrom/.../squid-2.5.STABLE6-6.i586.rpm           # SuSE 9.2

At this writing, the current stable Squid version is 2.5.

Configuring Squid

Once Squid is installed, you’ll need to configure it. Squid has one central configuration file. Every time this file changes, the configuration must be reloaded with the command /sbin/init.d/squid reload.

You can edit the configuration file with a text editor. You’ll find a detailed description of the settings inside the squid.conf file, although the discussion is sometimes very technical and difficult to understand.
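Because the stock squid.conf consists mostly of commentary, it can help to look only at the lines that are actually active. The following Python sketch shows one way to do that. The helper name active_directives and the sample text are made up for illustration, and the sketch assumes the simple "name value" line layout used throughout this article.

```python
def active_directives(conf_text):
    """Return (name, value) pairs for every line that is neither blank nor a comment."""
    directives = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and explanatory comments
        parts = line.split(None, 1)  # directive name, then the rest of the line
        name = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        directives.append((name, value))
    return directives


# Hypothetical excerpt in the style of a heavily commented squid.conf
SAMPLE = """\
# TAG: visible_hostname
#   Long explanatory comment, as found in the stock file.
visible_hostname squid

# TAG: http_port
http_port 80
"""

for name, value in active_directives(SAMPLE):
    print(name, "=", value)
```

To run this against a real file, read it first, for example with open("/etc/squid/squid.conf").read(), and pass the text to the helper.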
This section summarizes some of the important settings in the squid.conf file.

First of all, you can prevent certain metadata related to your configuration from reaching the external world when you surf the Web:

    vi /etc/squid/squid.conf
    ...
    anonymize_headers deny From Server Via User-Agent
    forwarded_for off
    strip_query_terms on

Note that you cannot anonymize Referer and WWW-Authenticate because otherwise authentication and access control mechanisms won’t work.

forwarded_for off means that the IP address of the internal client will not be sent externally in the X-Forwarded-For header.

With strip_query_terms on, you do not log URL parameters after the ?. When this parameter is set to off, the full URL is logged in the Squid log files. This feature can help with debugging the Squid filters, but it can also violate privacy rules.

The next settings identify the Squid host, the (internal) domain where the machine is operating, and the username of whoever is responsible for the server. Note the dot in front of the domain. Further on, you find the name of the local DNS caching server, and the number of domain names to cache into the Squid server.

    visible_hostname squid
    append_domain .mshome.net
    cache_mgr sysman
    dns_nameservers 192.168.0.1
    dns_testnames router.mshome.net
    fqdncache_size 1024
    http_port 80
    icp_port 0

Figure 1: Ethernet basic LAN configuration.

http_port is the port used by the proxy server. You can choose anything, as long as the configuration does not conflict with other ports on your router. A common choice is 8080 or 80. The Squid default, 3128, is difficult to remember.

We are not using ICP, the protocol that synchronizes proxy servers, so we set icp_port to 0.

With log_mime_hdrs on, you can make MIME headers visible in the access.log file.

Avoid Disk Contention

Squid needs to store its cache somewhere on the hard disk. The cache is a tree of directories.
With the cache_dir option in the squid.conf file, you can specify configuration settings such as the following:
• disk I/O mechanism – aufs
• location of the squid cache on the disk – /var/cache/squid
• amount of disk space that can be used by the proxy server – 2.5 GB
• number of main directories – 16
• subdirectories – 256

For instance:

    cache_dir aufs /var/cache/squid 2500 16 256
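To get a feel for what the numbers in the cache_dir line mean, the following Python sketch splits the example line into its fields and works out how many second-level directories Squid will create and how much cache space each one holds on average. The function name parse_cache_dir is hypothetical, and the sketch assumes the six-field layout of the example above, with no trailing options.

```python
def parse_cache_dir(line):
    """Split a cache_dir line of the form: cache_dir <type> <path> <MB> <L1> <L2>."""
    fields = line.split()
    if len(fields) != 6 or fields[0] != "cache_dir":
        raise ValueError("expected: cache_dir <type> <path> <MB> <L1> <L2>")
    _, io_type, path, size_mb, l1, l2 = fields
    return {
        "type": io_type,          # disk I/O mechanism, e.g. aufs
        "path": path,             # top of the cache directory tree
        "size_mb": int(size_mb),  # disk space Squid may use, in MB
        "l1": int(l1),            # number of main directories
        "l2": int(l2),            # subdirectories per main directory
    }


layout = parse_cache_dir("cache_dir aufs /var/cache/squid 2500 16 256")
leaf_dirs = layout["l1"] * layout["l2"]    # 16 * 256
avg_mb = layout["size_mb"] / leaf_dirs     # average share per subdirectory

print(leaf_dirs)         # 4096
print(round(avg_mb, 2))  # 0.61
```

With 16 main directories of 256 subdirectories each, the 2500 MB are spread over 4096 directories, or roughly 0.6 MB per directory if the distribution were perfectly even.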


The disk access method options are as follows:
• ufs – classic disk access (too much I/O can slow down the Squid server)
• aufs – asynchronous UFS with threads, less risk of disk contention
• diskd – diskd daemon, avoiding disk contention but using more memory

UFS is the classic UNIX file system I/O. We recommend using aufs to avoid I/O bottlenecks. (When you use aufs, you have fewer processes.)

    # ls -ld /var/cache/squid
    lrwxrwxrwx 1 root root 19 Nov 22 00:42 /var/cache/squid -> /volset/cache/squid

I suggest you keep the standard file location for the squid cache /var/cache/squid, then create a symbolic link to the real cache directory. If you move the cache to another disk for performance or capacity reasons, you only have to modify the symbolic link.

The disk space is distributed among all directories. You would normally look for even distribution across all directories, but in practice, some variation in the distribution is acceptable. More complex setups using multiple disks are possible, but for home use, one directory structure is sufficient.

    maximum_object_size_in_memory 2048 KB

Log Format Specification

You can choose between Squid log format and standard web server log format using the parameter emulate_httpd_log. When the parameter is set to on, standard web log format is used; if the parameter is set to off, you get more details with the Squid format. See [7] for more on analyzing Squid log files.

Proxy Hierarchy

The Squid proxy can work in a hierarchical way. If you want to avoid the parent proxy for some destinations, you can allow a direct lookup. The browser will still use your local proxy!

    acl direct-domain dstdomain .turboline.be
    always_direct allow direct-domain
    acl direct-path urlpath_regex -i "/etc/squid/direct-path.reg"
    always_direct allow direct-path

Some ISPs allow you to use their proxy server to visit their own pages even if you are not a customer.
This can help you speed up your visits to their pages. The closer the proxy to the original pages, the more likely the page is to be cached. Because your own ISP is more remote, the ISP is less likely to be caching its competitor’s contents…

    cache_peer proxy.tiscali.be parent 3128 3130 no-query default
    cache_peer_domain proxy.tiscali.be .tiscali.be

no-query means that you do not use, or cannot use, ICP (the Internet Cache Protocol), see [8]. You can obtain the same functionality using regular expressions, and this gives you more freedom.

    cache_peer proxy.tiscali.be parent 3128 3130 no-query default
    acl tiscali-proxy dstdom_regex -i \.tiscali\.be$
    cache_peer_access proxy.tiscali.be allow tiscali-proxy

Table 2: ACL Guidelines
• the order of the rules is important
• first list all the deny rules
• the first matching rule is executed
• the rest of the rules are ignored
• the last rule should be an allow all

Cache Replacement

By default, the proxy server uses an LRU (Least Recently Used) algorithm. Detailed studies by HP Laboratories [6] have revealed that an LRU algorithm is not always an intelligent choice.
The GDSF setting keeps small popular objects in cache, while removing bigger and lesser used objects, thus increasing the overall efficiency.

    cache_replacement_policy heap GDSF
    memory_replacement_policy heap GDSF

Big objects requested only once can flush out a lot of smaller objects, therefore you’d better limit the maximum object size for the cache:

    cache_mem 20 MB
    maximum_object_size 16384 KB

Listing 1: Blocking Unwanted Pages

    acl block-ip dst "/etc/squid/block-ip.reg"
    deny_info filter_spam block-ip
    http_access deny block-ip

    acl block-hosts dstdom_regex -i "/etc/squid/block-hosts.reg"
    deny_info filter_spam block-hosts
    http_access deny block-hosts

    acl noblock-url url_regex -i "/etc/squid/noblock-url.reg"
    http_access allow noblock-url Safe_ports

    acl block-path urlpath_regex -i "/etc/squid/block-path.reg"
    deny_info filter_spam block-path
    http_access deny block-path

    acl block-url url_regex -i "/etc/squid/block-url.reg"
    deny_info filter_spam block-url
    http_access deny block-url
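The ACL guidelines in Table 2 boil down to "first matching rule wins." The Python sketch below models that idea for the http_access lines of Listing 1. It is a simplified illustration, not Squid's real matching code. A request is represented just by the set of ACL names it matches. The final allow all rule is an assumption taken from Table 2 and is not shown in Listing 1. The function names are hypothetical.

```python
# http_access rules from Listing 1, in order. Each rule is (action, [ACL names]);
# every ACL on a line must match (logical AND). A leading "!" negates an ACL.
RULES = [
    ("deny", ["block-ip"]),
    ("deny", ["block-hosts"]),
    ("allow", ["noblock-url", "Safe_ports"]),
    ("deny", ["block-path"]),
    ("deny", ["block-url"]),
    ("allow", ["all"]),  # assumed closing rule, following Table 2
]


def rule_matches(acl_names, request_acls):
    """True if every (possibly negated) ACL on the rule line matches the request."""
    for name in acl_names:
        negated = name.startswith("!")
        matched = name.lstrip("!") in request_acls
        if matched == negated:  # plain ACL missed, or negated ACL hit
            return False
    return True


def decide(request_acls, rules=RULES):
    """Return the action of the first matching rule; later rules are ignored."""
    acls = set(request_acls) | {"all"}  # every request matches "all"
    for action, acl_names in rules:
        if rule_matches(acl_names, acls):
            return action
    return None  # no rule matched; Squid's own fallback is not modeled here


# A URL on both the block list and the whitelist: the earlier allow rule wins.
print(decide({"block-url", "noblock-url", "Safe_ports"}))  # allow
# A blocked file extension with no whitelist entry is denied.
print(decide({"block-path"}))                               # deny
```

The first example shows why the order matters: the whitelist rule for noblock-url has to appear before the block-path and block-url rules, or it would never be reached.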


Listing 2: Making a Page Invisible

    vi /etc/squid/errors/filter_spam
    ...


Listing 4: Blocking by Path or Extension

    vi /etc/squid/block-path.reg
    ...
    \.ad[ep](\?.*)?$
    \.ba[st](\?.*)?$
    \.chm(\?.*)?$
    \.cmd(\?.*)?$
    \.com(\?.*)?$
    \.cpl(\?.*)?$
    \.crt(\?.*)?$
    \.dbx(\?.*)?$
    \.hlp(\?.*)?$
    \.hta(\?.*)?$
    \.in[fs](\?.*)?$
    \.isp(\?.*)?$
    \.lnk(\?.*)?$
    \.md[abetwz](\?.*)?$
    \.ms[cpt](\?.*)?$
    \.nch(\?.*)?$
    \.ops(\?.*)?$
    \.pcd(\?.*)?$
    \.p[ir]f(\?.*)?$
    \.reg(\?.*)?$
    \.sc[frt](\?.*)?$
    \.sh[bs](\?.*)?$
    \.url(\?.*)?$
    \.vb([e])?(\?.*)?$
    \.vir(\?.*)?$
    \.wm[sz](\?.*)?$
    \.ws[cfh](\?.*)?$

INFO
[1] Presentation for the HP-Interex user group in Belgium on 17/03/2005 about “Implementing a home Router, Firewall, Proxy server, and DNS Caching Server using Linux”: http://users.belgacombusiness.net/linuxug/pub/router/linux-router-firewall-proxy.zip
[2] Firewalls: http://www.linux-magazine.com/issue/40/Checkpoint_FW1_Firewall_Builder.pdf http://www.linux-magazine.com/issue/34/IPtables_Firewalling.pdf
[3] About Squid in general: http://www.squid-cache.org http://squid-docs.sourceforge.net/latest/book-full.html#AEN1685 http://www.squid-cache.org/FAQ/FAQ-10.html
[4] Squid mirror sites: http://www1.de.squid-cache.org http://www1.fr.squid-cache.org http://www1.nl.squid-cache.org http://www1.uk.squid-cache.org
[5] Suse 9.2 Professional – DVD software distribution: http://www.linux-magazine.com/issue/54/Linux_Magazine_DVD.pdf
[6] For more information about the GDSF and LFUDA cache replacement policies see: http://www.hpl.hp.com/techreports/1999/HPL-1999-69.html http://fog.hpl.external.hp.com/techreports/98/HPL-98-173.html
[7] Reporting and analysing Squid log files: http://www.linux-magazine.com/issue/36/Charly_Column.pdf
[8] ICP – Internet Cache Protocol: http://en.wikipedia.org/wiki/Internet_Cache_Protocol
[9] The whois database: http://www.ripe.net/db/other-whois.html
[10] About Regular Expressions: http://www.python.org/doc/current/lib/module-re.html
[11] Example configuration files for Squid: http://members.lycos.nl/geertivp/pub/squid

... times executable zip files that install software. Squid lets you block files by path, filename, or file extension, as shown in Listing 4.

Squid also lets you filter for regular expressions used in the URL.

Of course, your filter may occasionally turn up a false positive. You can add regular expressions for URLs you specifically don’t want to block to /etc/squid/noblock-url.reg.

    vi /etc/squid/noblock-url.reg
    ...
    ^http://ads\.com\.com/

You can find an up-to-date version of those configuration files at [11].

Protect your Ports

For security reasons, you should disable all ports and only allow well known web ports using the syntax shown in Listing 5.

Listing 5: Protecting Ports

    acl Safe_ports port 80     # http
    acl Safe_ports port 21     # ftp
    acl Safe_ports port 2020   # BeOne Radio
    acl Safe_ports port 2002   # Local server
    acl Safe_ports port 8044   # Tiscali
    acl Safe_ports port 8080   # Turboline port scan
    acl Safe_ports port 8081   # Prentice Hall

    # Deny requests to unknown ports
    http_access deny !Safe_ports

The same can be done for connected ports, that is, ports reached with the CONNECT method. You can allow SSL ports when connected, and deny them otherwise. Remember that the normal HTTP protocol is not connected. The browser and the server establish a new connection for every page visit.

    acl SSL_ports port 443 563
    acl SSL_ports port 1863        # Microsoft Messenger
    acl SSL_ports port 6346-6353   # Limewire
    http_access allow CONNECT SSL_ports
    http_access deny CONNECT

Do not allow others to misuse your cache! You only want your cache to be used by your own intranet. Users on the external Internet should not be able to access your cache:

    acl localhost src 127.0.0.1/255.255.255.255
    acl localnet-src src 192.168.0.0/24
    http_access deny !localnet-src

Allowing All the Rest

To allow only the protocols and the methods that you want:

    acl allow-proto proto HTTP
    http_access deny !allow-proto
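Before putting extension patterns like those in Listing 4 into production, it can be useful to try them against a few sample URLs. The Python sketch below does this for a handful of the patterns. It relies on Python's re module [10], which may differ in details from the regular expression library Squid is built with, so treat the results as an approximation. The example.com URLs and the helper names are made up. The sketch assumes that urlpath_regex sees the path together with any query string, and it uses re.IGNORECASE to stand in for the -i option.

```python
import re
from urllib.parse import urlsplit

# A few patterns copied from Listing 4
PATTERNS = [
    r"\.ba[st](\?.*)?$",
    r"\.p[ir]f(\?.*)?$",
    r"\.reg(\?.*)?$",
    r"\.vb([e])?(\?.*)?$",
]
COMPILED = [re.compile(p, re.IGNORECASE) for p in PATTERNS]


def url_path(url):
    """Return the path plus query string, without scheme and host."""
    parts = urlsplit(url)
    path = parts.path
    if parts.query:
        path += "?" + parts.query
    return path


def is_blocked(url):
    """True if any of the sample patterns matches the URL path."""
    path = url_path(url)
    return any(rx.search(path) for rx in COMPILED)


for url in (
    "http://example.com/setup.BAT",
    "http://example.com/a.pif?x=1",
    "http://example.com/index.html",
    "http://example.com/download.reg.html",
):
    print(url, "->", "blocked" if is_blocked(url) else "allowed")
```

The last URL shows the effect of the trailing $ in each pattern: the extension only counts when it ends the path or is followed directly by a query string.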
