Adam Fields (weblog)

This blog is largely deprecated, but is being preserved here for historical interest. Check out my index page at adamfields.com for more up-to-date info. My main trade is technology strategy, process/project management, and performance optimization consulting, with a focus on enterprise and open source CMS and related technologies. I write periodic long pieces here; shorter stuff goes on twitter or app.net.

2/1/2006

Ruby script to fetch hosts file and turn it into a privoxy block list

Posted by adam @ 1:34 pm

There are plenty of servers out there that could simply disappear from the internet without anything bad happening. They include known ad server, spam, and spyware sites. The fine folks at mvps.org maintain a good list as a hosts file, which is up to about 10,000 entries now. Since I couldn’t figure out how to get privoxy to honor the local hosts file when doing DNS lookups, I wrote a little ruby script to fetch that file, break it down, and output a privoxy block list.

I chose ruby because I’ve been working with it lately, and I really, really like it. I find it incredibly easy to write, read, and work with.

If you’re a ruby developer, improvements of all kinds are welcomed. Please feel free to comment and discuss ways I could have made this more ruby-ish. Also, I haven’t quite grokked what the right approach is for ruby error/exception handling. Opinions on where checks should go are welcomed. For example, the whole thing is wrapped in a conditional block of opening the file. Do I need to handle any exception conditions, or is that all just taken care of properly?

#!/usr/bin/ruby

require 'open-uri'

hosts = Array.new
header = 1

open('http://www.mvps.org/winhelp2002/hosts.txt') do |file|
  file.each_line() do |line|
    # skip if still in header
    header = 0 if line =~ /^#start/
    next if header == 1
    # skip comments
    next if line =~ /^\s*#/

    # add the hostname to an array (blank lines have no second field)
    host = line.split[1]
    hosts << host if host
  end

  # write the output file
  outfile = open('privoxy_user_actions.txt', "w")
  outfile.puts "{ +block }" + "\n"
  hosts.each do |host|
    outfile.puts host + "\n"
  end
  outfile.close
end
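
On the error-handling question, here is a minimal sketch (not the script above) that splits the work into two small methods, so the parsing can be tried without touching the network. The names `parse_hosts` and `write_block_list`, the sample text, and the temp-directory output path are all made up for illustration, and it assumes the hosts file keeps its `#start` marker and its `address hostname` line layout. The block form of `File.open` closes the file even if an exception is raised inside the block, so no explicit `close` is needed.

```ruby
require 'stringio'
require 'tmpdir'

# Hypothetical helper: collect hostnames from an IO-like object holding
# hosts-file text. Everything up to and including the "#start" marker is
# treated as header; comment lines and blank lines are skipped.
def parse_hosts(io)
  hosts = []
  in_header = true
  io.each_line do |line|
    if in_header
      in_header = false if line =~ /^#start/
      next
    end
    stripped = line.strip
    next if stripped.empty? || stripped.start_with?('#')
    host = stripped.split[1]   # second field is the hostname
    hosts << host if host
  end
  hosts
end

# Hypothetical helper: write the block list. The block form of File.open
# closes the file when the block exits, including on an exception.
def write_block_list(hosts, path)
  File.open(path, 'w') do |out|
    out.puts '{ +block }'
    hosts.each { |host| out.puts host }
  end
end

# Made-up sample standing in for the downloaded file.
sample = StringIO.new(<<~HOSTS)
  # header comment
  127.0.0.1  localhost
  #start of lines
  127.0.0.1  ads.example.com
  # a comment

  127.0.0.1  tracker.example.net  # trailing note
HOSTS

hosts = parse_hosts(sample)

# Rescue only the failure we expect (OS-level I/O errors) and report it.
out_path = File.join(Dir.tmpdir, 'privoxy_user_actions.txt')
begin
  write_block_list(hosts, out_path)
rescue SystemCallError => e
  warn "Cannot write block list: #{e.message}"
end
```

In a real run, `parse_hosts(file)` would be called inside the open-uri block, with a similar `rescue` around the fetch. Note that on current Ruby versions open-uri is invoked as `URI.open(...)` rather than a bare `open(...)`.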



3 Responses to “Ruby script to fetch hosts file and turn it into a privoxy block list”

  1. James Wetterau Says:

    My minor changes follow. I'm not sure if this is the ruby way or not,
    and I too would be curious to hear from more experienced Ruby developers
    what they think of what I’ve done here:

    require 'open-uri'
    
    hosts = Array.new
    
    begin
      open('http://www.mvps.org/winhelp2002/hosts.txt') do |file|
        # skip if still in header
        file.each_line() { |line| break if line =~ /^#start/ }
    
        # skip comments and add the hostname to an array
        file.each_line() { |line| hosts << line.split[1] if line !~ /^\s*#/ }
      end
    rescue
      print "An error occurred reading data from the web: #{$!}\n"
      exit 1
    end
    
    # write the output file
    begin
      outfile = open('privoxy_user_actions.txt', "w")
    rescue
      print "Cannot open file: #{$!}\n"
      exit 2
    end
    outfile.puts "{ +block }\n"
    hosts.each { |host| outfile.puts host + "\n" }
    outfile.close
    
    
  2. adam Says:

    Since I/O iterators maintain their position when you break out of the initial block, how do you go back to the beginning of the file if you want to reset the each_line? Do you need to close and reopen the file?

  3. James Wetterau Says:

    file.rewind()
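
A small sketch of that behaviour, using a `StringIO` as a stand-in for the downloaded file (the sample text is made up): breaking out of `each_line` leaves the read position where it stopped, and `rewind` puts it back at the start.

```ruby
require 'stringio'

# Made-up sample: two header lines followed by one entry.
io = StringIO.new("# header\n#start\n127.0.0.1 ads.example.com\n")

# Consume lines up to and including the "#start" marker.
io.each_line { |line| break if line =~ /^#start/ }

# A second pass picks up where the first one stopped.
rest = io.each_line.to_a

# rewind moves the read position back to the beginning.
io.rewind
all = io.each_line.to_a
```

Here `rest` holds only the lines after the marker, while `all` holds every line again, so there is no need to close and reopen the file.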
