Ruby script to fetch hosts file and turn it into a privoxy block list
There are plenty of servers out there that could just disappear from the internet without much bad happening. They include known ad servers, spam sites, and spyware sites. The fine folks at mvps.org maintain a good list, which is up to about 10,000 entries now. Since I couldn’t figure out how to get privoxy to honor the local hosts file when doing DNS lookups, I wrote a little ruby script to fetch that file, break it down, and output a privoxy block list.
I chose ruby, because I’ve been working with it lately, and I really really like it. I find it incredibly easy to write, read, and work with.
If you’re a ruby developer, improvements of all kinds are welcome. Please feel free to comment and discuss ways I could have made this more ruby-ish. Also, I haven’t quite grokked what the right approach is for ruby error/exception handling. Opinions on where checks should go are welcome. For example, the whole thing is wrapped in the block passed to the open call. Do I need to handle any exception conditions, or is that all just taken care of properly?
#!/usr/bin/ruby
require 'open-uri'

hosts = Array.new
header = 1

open('http://www.mvps.org/winhelp2002/hosts.txt') do |file|
  file.each_line() do |line|
    # skip if still in header
    header = 0 if line =~ /^#start/
    next if header == 1

    # skip comments
    next if line =~ /^\s*#/

    # add the hostname to an array
    hosts << line.split[1]
  end

  # write the output file
  outfile = open('privoxy_user_actions.txt', "w")
  outfile.puts "{ +block }" + "\n"
  hosts.each do |host|
    outfile.puts host + "\n"
  end
  outfile.close
end
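On the exception question, here is one possible shape, offered as a sketch rather than the definitive ruby way. The block form of open does close the file for you, but a failed fetch (HTTP error, DNS failure, refused connection) still raises, so a rescue around the whole operation is a reasonable place for the check. The method names extract_hosts and write_block_list are made up for this illustration, and URI.open assumes Ruby 2.5 or later (the script above uses the older plain open). The sketch also skips blank lines, on the assumption that the hosts file contains some.

```ruby
require 'open-uri'
require 'socket'    # makes sure SocketError is defined for the rescue below
require 'stringio'

# Hypothetical helper: pull hostnames out of any hosts-format stream.
def extract_hosts(io)
  in_header = true
  hosts = []
  io.each_line do |line|
    # everything before the "#start" marker is header
    in_header = false if line =~ /^#start/
    next if in_header
    # skip comments and blank lines
    next if line =~ /^\s*(#|$)/
    host = line.split[1]
    hosts << host if host
  end
  hosts
end

# Hypothetical wrapper: fetch, parse, write. Returns true on success,
# false if the fetch or the write failed.
def write_block_list(url, path)
  # Assumption: Ruby 2.5+, where URI.open is the open-uri entry point.
  hosts = URI.open(url) { |file| extract_hosts(file) }
  # The block form closes the output file even if a write raises.
  File.open(path, 'w') do |out|
    out.puts '{ +block }'
    hosts.each { |host| out.puts host }
  end
  true
rescue OpenURI::HTTPError, SocketError, SystemCallError => e
  warn "could not build block list: #{e.message}"
  false
end
```

Splitting the parsing out of the network call also means the parser can be tried on a StringIO holding a few sample lines, without fetching anything.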
February 1st, 2006 at 5:55 pm
My minor changes. I’m not sure if this is the ruby way or not, and I too would be curious to hear from more experienced Ruby developers what they think of what I’ve done here:
February 1st, 2006 at 6:46 pm
Since I/O iterators maintain their position when you break out of the initial block, how do you go back to the beginning of the file if you want to reset the each_line? Do you need to close and reopen the file?
February 1st, 2006 at 6:57 pm
file.rewind()
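A small sketch of that, using a StringIO as a stand-in for the fetched file (the sample contents are invented for illustration). As far as I know, open-uri hands the block either a StringIO or a Tempfile, and both respond to rewind, so no close and reopen should be needed.

```ruby
require 'stringio'

# Stand-in for the fetched file; the contents are invented sample data.
file = StringIO.new("one\ntwo\nthree\n")

first = nil
file.each_line do |line|
  first = line.chomp
  break                      # the read position now sits just after "one"
end

# A second each_line picks up where the first one stopped.
rest = file.each_line.map(&:chomp)

# rewind moves the position back to the start of the stream.
file.rewind
all = file.each_line.map(&:chomp)
```

Here rest holds only the lines after the break, while all holds every line again once rewind has been called.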