Fight spam with The Project Honey Pot. The interview.


One of the most pressing problems for bloggers and webmasters is, without any doubt, dealing with malware and finding the most effective way to fight spam. Every day, blogs and websites are literally bombarded by dozens of comments and emails on the most varied topics. A lot of them, alas, are spam. Blue pills, miraculous treatments, links to porn websites and doubtful deals are only a few examples of the content website owners have to deal with.

Besides all this, there are other threatening activities which, although they do not produce results as obvious as the ones described above, can harm websites, webmasters and readers in the long term. I am talking about spam bots, spam crawlers and spam harvesters, which silently scan websites looking for email addresses, bugs or security flaws in order to send junk email or inject spam code into a website’s pages.

Some months ago, while surfing the web, I came across an interesting organisation fighting spam in a very effective way. I am talking about The Project Honey Pot. They provide code that lets you install hidden traps inside your web pages. These traps are quite attractive to spam… hey, that’s why they are called “Honey Pots”. Even though I was a little skeptical, I decided to give it a try. Believe it or not, since then I have seen a great reduction in the spam hitting my blog, resulting in better blog performance and more time to devote to my readers and the posts I publish!

Here is my interview with Matthew Prince, a brilliant attorney and law professor who coordinates the overall strategy at The Project Honey Pot. He was also one of the co-founders, so who better than him to explain to you guys what The Project Honey Pot is all about and how it can help you fight spam?

1) Hi Matthew! I would like you to explain to my readers what the Project Honey Pot is.

Project Honey Pot is a community effort where website administrators around the world can work together to track bad guys online. The Project tracks machines that are harvesting email addresses, posting comment spam, sending email spam, breaking robots.txt rules, and engaging in other malicious behavior. More than 60,000 website administrators in more than 140 countries have signed up to the project. Most have either installed a small script on their site which helps us track the bad guys, or, if they’re unable to install scripts on their site, they’ve included specially tagged, invisible links to other people’s honey pots. On any given day, through the efforts of the members of the Project, we’re tracking nearly 50 million computers that are doing bad things online.

2) Great piece of work indeed! What role do you play in the Project? (from now on I will refer to Project Honey Pot as PHP)

I was one of the co-founders on the Project Honey Pot team. They don’t let me write much code anymore as our needs have scaled beyond my technical abilities, but I still help coordinate the overall strategy and map a course for where we’re going. I’m also an attorney and law professor so I help interact with law enforcement authorities with whom we share our data in order to help stop online criminals.

3) In what ways do you think PHP is going to be useful to bloggers and Webmasters?

There are two main ways: one which is direct, one which is indirect. The direct benefit to web admins is that we provide access to the Project Honey Pot data free to all the members of the community through something called http:BL. That service allows anyone with a website to quickly query the PHP system and determine if a visitor is a threat. It’s up to the individual web admin what to do with that information, but PHP is empowering websites to make intelligent choices about who they let on, and who they keep away.

Indirectly, being part of the PHP community is an active step that web admins can take to do something about online criminals. There are a lot of people who are engaged in bad behavior on the Internet. Unfortunately, that’s making it a worse place for the good guys to be. When we’ve talked to people running websites they complain that the bad guys often make what would otherwise be a fun occupation or hobby a real pain. There’s no way that an individual site can do much about these attackers, but if we can pool our resources and data together then we’ll have the information necessary to go after them.

4) A lot of my readers have a website or a CMS. What platforms does PHP support? For example, is it compatible with WordPress, Joomla!, Drupal etc.? Also, is it difficult to implement on a blog?

There are two parts to PHP: the trap and http:BL. The full versions of the traps are simple scripts. We have versions that support most scripting platforms including: PHP (the language), Python, Perl, ASP, .NET, ColdFusion, etc. So, for example, if you’re running WordPress then that CMS runs on top of PHP (the language). A web admin can download the PHP-version of the honey pot script and follow the installation instructions. Generally, if you have the right to install executable scripts on your host, there will be a version of the trap scripts that you can run.

If you’re running in a managed environment (e.g., Blogger, LiveJournal, etc.) then you are usually not allowed to install software and therefore cannot install one of our honey pot scripts. For these users, we provide what we call QuickLinks. These links point to honey pot scripts that other users have installed. The QuickLink you receive has a special code that allows us to track that you are the one who referred any bad guys off to the honey pot. In the statistics that we report, both you (as the QuickLinker) and the honey pot installer get credit for the catch. QuickLinks can be installed by anyone who can edit the HTML of a site, regardless of the CMS the site runs on.

The other part of PHP is http:BL. While the traps are ways to get data into PHP, http:BL is a way for website admins to take advantage of that data. While PHP doesn’t author or maintain any http:BL implementations ourselves, several members have created versions for popular CMS and forum software (e.g., WordPress, vBulletin, etc.). You can search our site or the internet to find these implementations. If your particular CMS isn’t supported, we provide a very simple API that should allow virtually any platform to take advantage of PHP’s data.
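
To give readers a feel for what such a lookup might involve, here is a minimal Python sketch. It assumes the DNS-style query format described in the public http:BL documentation (access key, then the visitor's IPv4 octets reversed, then the `dnsbl.httpbl.org` zone, with the answer encoded as `127.<days>.<threat>.<type>`). The function names are hypothetical, the access key is a placeholder, and the details should be checked against the current API documentation before relying on them.

```python
import ipaddress
import socket

# Visitor-type bit flags as I understand them from the http:BL docs.
# These values are assumptions: confirm them against the current documentation.
SUSPICIOUS = 1
HARVESTER = 2
COMMENT_SPAMMER = 4


def build_query(access_key: str, visitor_ip: str) -> str:
    """Build the DNS name to resolve: key, reversed IPv4 octets, then the zone."""
    octets = str(ipaddress.IPv4Address(visitor_ip)).split(".")
    return ".".join([access_key, *reversed(octets), "dnsbl.httpbl.org"])


def decode_response(answer: str) -> dict:
    """Split an answer such as '127.3.5.1' into its assumed meaning."""
    first, days, threat, visitor_type = (int(part) for part in answer.split("."))
    if first != 127:
        raise ValueError(f"unexpected http:BL answer: {answer}")
    return {
        "days_since_last_activity": days,
        # For search engines (type 0) the third octet is reportedly an
        # engine identifier rather than a threat score.
        "threat_score": threat,
        "is_search_engine": visitor_type == 0,
        "is_suspicious": bool(visitor_type & SUSPICIOUS),
        "is_harvester": bool(visitor_type & HARVESTER),
        "is_comment_spammer": bool(visitor_type & COMMENT_SPAMMER),
    }


def lookup(access_key: str, visitor_ip: str):
    """Return decoded data for a listed IP, or None when there is no record."""
    try:
        answer = socket.gethostbyname(build_query(access_key, visitor_ip))
    except socket.gaierror:
        return None  # name did not resolve: the IP is not listed
    return decode_response(answer)
```

A site could call something like `lookup("youraccesskey", visitor_ip)` for each incoming request and then decide for itself, for instance, to hold comments from visitors flagged as comment spammers with a high threat score. What to do with the answer is left to the web admin, as Matthew explains above.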

5) Seems too good to be true… Mmm… is it free, or have you guys put in some hidden fees we should know about?

Nope. We throw some Google ads up on the site to help pay the bills, and we have some Monitoring services for large networks that we charge modest fees for, but everything for the average web administrator is free. We rely on web admins to gather the data. Our philosophy is that if you’re willing to contribute data to us to make the Internet better, we want to make sure you can take advantage of that data for free as well.

6) Wow! That sounds like heaven to bloggers! Moving on, you know, I am curious and I like digging under the surface. How old is PHP and what was the trigger that pushed you to create it?

We first started building PHP back in 2004. The original idea was to try and track the whole spam email cycle. While a lot of people were looking at who was sending spam, that was really the last step in a very long chain. We were curious if we could see more of the chain by tracking the moment in time when email addresses were first harvested from websites. Over time, we’ve expanded PHP to track more and more malicious behaviors beyond email harvesting. While the backend technology was developed to be very robust, our website is looking a bit long in the tooth these days and really deserves an overhaul to better describe all the things we are doing.

7) Do you have any partnerships or contacts with important companies? What about Google, Yahoo etc.?

We work with a number of organizations in order to help crack down on Internet criminals. If you have read about an action against an Internet bad guy in the last several years, chances are, whether directly or indirectly, PHP data has played a role in helping bring the case. For a number of reasons, however, we tend to be pretty quiet about exactly who we’re working with.

8) Talking about Google, if a blogger places a PHP hidden link in his blog, is he going to be penalized by Google? Should bloggers be concerned about SEO, SERPs, PageRank etc. when it comes to installing PHP?

That’s a good question and one we get occasionally. We’ve talked to the Google folks about what we’re doing. They’ve been really supportive. One of their senior ranking engineers assured us that there would be no punishment for including links to honey pots in the ways we recommend. Generally, Google and other search engines have punished the linked-to pages, not the linkers. We actually work pretty hard to keep our honey pots out of the Google index so it’s fine if they don’t get high page ranks.

9) OK Matthew, you have been great in this interview! Can you tell us something about the future of PHP?

About 2 years ago we filed the biggest lawsuit in history against email spammers. While you can track a lot of things through technical means, like those used by PHP, in some cases you need to be able to subpoena records and use the courts. Over the last 2 years we’ve learned a ton about the spammers themselves as well as the businesses they use to ply their trade. In the next few weeks I expect that we’ll release a comprehensive white paper on our findings. My expectation is that the data we provide there will launch a number of actions by law enforcement and targeted companies to help take down a number of bad guys. Shortly after we release that, watch for news about another large legal action by PHP. It’s still in the works, but I suspect we may be reaching out beyond just email spammers this time.

From a technical perspective, we keep beefing up our architecture to keep up with our growth. Receiving, indexing, and cataloging millions of spam messages each day has been a cool technical challenge and we’ve got some great engineers working on the project. We’ve got some great stuff in our development pipeline too, including better tracking of rule-breaking bots (e.g., those that don’t follow robots.txt or other website rules), honey pots that look more like real website pages, better geoidentification of IPs we’re monitoring, ways for companies that are getting phished to get notified about the attacks, more information on the IPs that are hosting sites advertised in email and comment spam, and lots more. Hopefully, at some point before too long, we’ll also get time to update our web page so it looks a bit less 1990s and actually better describes what we’re doing.

10) Thanks Matthew for letting us know about Project Honey Pot! Have a great day and keep up the good work!

No problem, happy to help.

Here are more resources to help you better understand the huge echo The Project Honey Pot is having all around the world, how you can help and become one of its members, and other interesting links about different spam traps to set up on your website.

Countries participating in Project Honey Pot:

Daily stats:

http:BL API

Some http:BL implementations

