Interview: Michelle Sullivan, Founder of SORBS – Part I

We’re pleased to introduce our latest uTest interview – Michelle Sullivan with SORBS (AKA: the Spam and Open Relay Blocking System). Michelle founded SORBS in 2003, and continues to run SORBS as the director of engineering after the service was acquired in 2009 by GFI Software.

In Part I of our interview, we talk about what goes into running a global anti-spam blacklist, SORBS’s acquisition by GFI, and their recent bug that accidentally marked millions of legitimate emails as spam (which we covered).

Check back tomorrow for Part II.

uTest: Your problems this month caught a lot of attention (including ours). What typically goes into planning and QA behind the scenes at SORBS?

MS: Within SORBS, and before GFI, I always wrote and deployed code. I set up a number of systems for development and testing, and as such we had a development database, a development webserver, an alpha test webserver, and a beta test system, the latter two of which are connected to the production databases.

Normally every new function and feature is tested on the alpha test server before being deployed to the beta site for ‘external testing’, and finally it is deployed to the production servers. In changing from SORBS1 to SORBS v2.0 we had no production environment for SORBS v2.0, and we therefore had to rely on a much smaller userbase than normal. Testing was still carried out by a number of people, including myself, but with the reduced number of eyeballs some of the finer issues weren’t picked up.

A good example was the failure to set the ‘SSL cookie’ with the correct expiry time. The bug was a simple one: a ‘*’ had been used where a ‘+’ should have been used. Expiry of the cookie was therefore set to * 60 * 60 * 8760, which results in a huge number, and as time on a 32-bit Unix system is only 32 bits of data, the value rolled over several times. The net result was that during testing the cookie was set to a realistic value (by pure chance), but when launched to production enough seconds had passed to cause the rollover to land sometime during 1966, and therefore the cookie expired as soon as it was set (the correct code is: +60*60* 8760 which means ‘1 year from now’).
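The arithmetic behind that cookie bug can be sketched in a few lines of Python. This is an illustration only, not the SORBS code: the function names are hypothetical, and it assumes the timestamp is held in a signed 32-bit integer, which is what wraps a very large product back into an arbitrary (possibly past) date.

```python
import time

SECONDS_PER_YEAR = 60 * 60 * 8760  # 8760 hours = 365 days


def wrap_int32(value: int) -> int:
    """Reduce an integer to a signed 32-bit value (two's complement wraparound)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def buggy_expiry(now: int) -> int:
    # '*' where '+' was intended: the product overflows 32 bits many times,
    # so the result depends on 'now' in an essentially arbitrary way.
    return wrap_int32(now * SECONDS_PER_YEAR)


def correct_expiry(now: int) -> int:
    # '+' gives "one year from now".
    return now + SECONDS_PER_YEAR


if __name__ == "__main__":
    now = int(time.time())
    print("buggy expiry  :", buggy_expiry(now))
    print("correct expiry:", correct_expiry(now))
```

Because the wrapped value changes with every second, a test run can land on a plausible future date while a later production run lands in the past, which matches the behaviour described in the answer.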

uTest: What new lessons did you learn from the problem this month?

MS: Well, the biggest issue, the one that brought us to everyone’s attention, was the accidental switching of parameters in the DB insertion code. The problem wasn’t picked up in testing as the main code base had been tested for months, if not years, before. When the new ‘history’ field was added, a number of tests were run to check that it was correct and had the desired effect, but due to the values used in the tests the switch was not apparent (when inserting, if you use 2s and 1s in both columns, the switch is unseen). In addition, the migration tool had been tested thoroughly against the development database using the ‘pre-history’ code, and therefore wasn’t re-tested for the migration itself. The lesson learned is that if you change a core feature, you need to test everything again – even if it was working and should still be working because your other tests show it is all fine.
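The masking effect can be shown with a minimal Python sketch using the standard sqlite3 module. Everything here is hypothetical (the `listing` table, the `status` and `history` columns, and the function names are invented for illustration, and this is one reading of the remark about test values): a test that inserts the same value into both columns cannot tell a swapped insert from a correct one, while distinct values expose it at once.

```python
import sqlite3


def insert_correct(conn, status, history):
    # Parameters bound in the same order as the column list.
    conn.execute(
        "INSERT INTO listing (status, history) VALUES (?, ?)",
        (status, history),
    )


def insert_swapped(conn, status, history):
    # Hypothetical bug: the two parameters are bound in the wrong order.
    conn.execute(
        "INSERT INTO listing (status, history) VALUES (?, ?)",
        (history, status),
    )


def roundtrip_ok(insert, status, history):
    """Insert one row into a fresh in-memory table and check it reads back unchanged."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE listing (status INTEGER, history INTEGER)")
    insert(conn, status, history)
    row = conn.execute("SELECT status, history FROM listing").fetchone()
    conn.close()
    return row == (status, history)


if __name__ == "__main__":
    print(roundtrip_ok(insert_swapped, 1, 1))  # identical values hide the swap
    print(roundtrip_ok(insert_swapped, 1, 2))  # distinct values reveal it
```

Choosing a different value for every column in test data is a cheap way to make this class of bug visible.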

uTest: What made you decide to create SORBS? How did you go about building it?

MS: Back in 2000, I worked for Netscape Communications and I found that having the email address <firstname>@netscape.com resulted in me receiving over 900 emails every morning. This was a huge waste of time for me as it would take between two and three hours to sort through them.

When I moved to the University of Queensland towards the end of 2001, I found the problem was quite widespread. At the time there was the ‘monkeys.com’ blocklist created by Ronald F. Guilmette that listed open proxies. I noted that the list was very often incomplete and had trouble keeping up with new proxies. I decided to create a ‘proxy and relay tester’, which tested for open proxies ‘on the fly’ (i.e., a connecting machine would be tested for all common proxy types upon trying to connect/deliver email to the University’s servers). Within two months of running this application, I had collected a list of 78,000 new proxies and decided to approach the IT Director, Dr N. Tate, about publishing the list.

On January 6, 2003 I launched the SORBS website, which was a small piece of Perl on an Apache server with a MySQL and BIND backend. Within two weeks it was unusable as it was too popular. I upgraded the machine, and added another, before changing to the ever-popular rbldnsd DNS server. Before the end of 2004 we had located and listed some three million proxies and were one of the most popular DNSBLs on the Internet.

At the end of 2003 and during 2004, with a friend (Christopher Burke) who is a genius with databases and understanding data, I set about redesigning the whole system, and towards the end of 2004 I started the long re-write.

SORBS2, as it was codenamed (there’s an original name!), was a complete rewrite of SORBS to a fully modular and relational system where networks, URLs, hosts, DNS systems, etc., were all recorded and inter-related so that we could identify spammers and the botnets they used.

SORBS2 started collecting data during 2005 but from late 2005 until 2008/2009 I shelved SORBS2 due to lack of personal time.

uTest: You were acquired by GFI Software a year ago. What was that experience like and how has your last year been?

MS: It’s been quite a roller-coaster ride in many ways. GFI never had a blocklist of their own and as such it was a steep learning curve. SORBS was running on what we call the ‘SORBS1’ code base, which included some of the original code from 2002, and it required constant maintenance of the database and of the lists. The database itself consisted of some 17 unrelated tables that acted more as a way of fast indexing than a real database. SORBS2, on the other hand, is a completely new system designed to allow data mining of networks and abuse. My job at GFI has been to finish the SORBS2 codebase and deploy it to the world.

GFI have given me time to get the codebase out and have begun to plough resources into it. The result is we are now live with SORBS v2.0 and, whilst there are some cosmetic changes to the SORBS2 website, end users receive a very similar experience in the data it provides. The real beauty of deploying SORBS v2.0 is we can now start the real work that SORBS v2.0 was intended for, and indeed things like statistical analysis of spam, origins and destinations have already begun.

uTest: What does GFI mean for the future of SORBS?

MS: The biggest part of SORBS’s future is that GFI is now providing the necessary resources and therefore we can get the work, the improvements and the expansion done. It’s no longer all related to my pocket size and the generosity of the sponsors.

SORBS v2.1 is being worked on as I write and we are looking at roping in additional staff to work on the development of SORBS. There will be many new features over the next few months that address just about all the concerns that people have expressed over the years. The biggest and most notable is that ISPs can register and log in to the SORBS servers and access the database directly, for delisting of compromised hosts, spamming hosts, and changing the Usage listings (e.g., the SORBS DUHL) directly and without the need to wait for SORBS’ support staff.

On the non-technical side, SORBS has historically been run by volunteers which has from time to time infuriated listed parties as support has been provided as and when time is available. In the past this has been quite prompt at times, and also quite delayed at others. GFI has employed technical staff to look after support requests full-time as well as utilising the volunteer base – and the results are already visible. At the beginning of August 2010, the ‘Spam Database’ support request queue was trailing at over one month before people would receive a response; as of Friday October 15, 2010 we were responding to support requests just 48 hours old. In the next couple of weeks, we hope to get that down to just minutes.

The purchase of SORBS has guaranteed its future; GFI are taking and promoting the industry best practices of ‘Defense in Depth’ and as such are tackling the spam, malware, virus and botnet problems with a variety of products of which SORBS is just a part. The beauty of the GFI synergy is that SORBS is rapidly becoming a communications hub for what’s out there in the real world and what is found in the research labs. Data is flowing between departments and new problems are being located and mitigated to GFI and SORBS customers before they are even reported.

uTest: Tell us a little bit about what goes into a system like SORBS?

MS: This is a bit difficult to answer. At times I have had to work 18 hours a day, and for the last five years I’ve pretty much worked seven days a week on it.

What most people don’t realise about running an operation like SORBS is the amount of power you have, and with that power comes even greater responsibility. One small mistake and millions of people can be affected. At the same time there are people constantly attacking you, trying to trip you up, and in some cases just trying to destroy your reputation.

A good example of this is the transparency that SORBS works to. While some say SORBS is not transparent, the reality is we are more open and transparent than most of the blocklists out there, particularly the larger ones.

The Support Ticket system for SORBS utilises the popular ‘Request Tracker’ software by Best Practical. We make a point of denying database delisting requests unless there is an appropriate ticket in the database requesting (and justifying) the change. Requestors (the people lodging Support Requests) can log into the SORBS Support system and view all previous issues and the conversations they had with our support staff. At any point, should we be compelled, we can show changes, why they happened, and the full conversation, with times and actions, relating to any issue with any member of SORBS staff or a requestor.

The database itself has full auditing of changes and retains historical information to verify, validate and show any modifications, for both listing and delisting. The new SORBS v2.0 database even links the systems to provide faster access to the information.
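As a rough illustration of the policy described above (no delisting without a matching ticket, and every change recorded), here is a minimal Python sketch. It is not how SORBS or Request Tracker is implemented; the `ListingDB` class, its fields, and the action names are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ListingDB:
    listed: set = field(default_factory=set)        # currently listed IPs
    tickets: dict = field(default_factory=dict)     # ticket id -> IP it concerns
    audit_log: list = field(default_factory=list)   # append-only change history

    def _record(self, action, ip, ticket_id):
        # Every attempt is logged with a UTC timestamp, including denials.
        self.audit_log.append((datetime.now(timezone.utc), action, ip, ticket_id))

    def list_ip(self, ip):
        self.listed.add(ip)
        self._record("list", ip, None)

    def delist_ip(self, ip, ticket_id):
        # Deny the change unless a ticket exists for exactly this IP.
        if self.tickets.get(ticket_id) != ip:
            self._record("delist-denied", ip, ticket_id)
            return False
        self.listed.discard(ip)
        self._record("delist", ip, ticket_id)
        return True
```

Keeping the log append-only, and logging denials as well as successes, is what makes it possible to show afterwards what changed, when, and on whose request.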

The other part of the transparency is the policies; SORBS states quite clearly that, at a particular date and time, something happened. We might have received spam from a particular host or IP address, or we might have detected an open relay or open proxy. In each case we provide as much information as possible to indicate the cause of the entry and how people may resolve the issue. Most other blocklists don’t provide the information, resorting to ‘we received spam from it’ or ‘we think there might be a proxy on the host’ or even ‘this network is operated by a known spammer’ – the latter case usually provides information about the listing but rarely (if ever) shows the reason why the person is a ‘known spammer.’ This doesn’t mean, however, that the information is wrong – just that there is a lower amount of transparency than SORBS operates with.

Check back tomorrow for Part II for more about the tech and people that go into running SORBS, plus we get Michelle’s thoughts on the future of spam blocking and the controversial “SORBS fine.”

9 Responses to “Interview: Michelle Sullivan, Founder of SORBS – Part I”

  1. Interview: Michelle Sullivan, Founder of SORBS – Part II | Software Testing Blog said:

    [...] In Part II of our interview with Michelle Sullivan we discuss the future of spam blocking; whether or not blacklists or algorithms are still the best spam prevention method; the dreaded SORBS fine; and how Michelle would change email if she could. If you missed Part I of the conversation, you can find it here. [...]

  2. Your Emails Not Working? It’s Not Just You – #sorbs | Software Testing Blog said:

    [...] 3: Check out our full interview with Michelle Sullivan about SORBS, blacklisting, the tech running behind the scenes, and their [...]

  3. Randolf Richardson said:

    I have a great deal of respect for what Michelle has accomplished with SORBS. This was an informative interview.

  4. Chris Smith said:

    Maybe you should have a part 3 and ask Michelle what her thoughts on blacklisting entire /13 /12 subnets via duhl in the last 3 days are? Why their delisting procedure claims it does dns lookups when in fact it is caching (incorrectly I might add) ttl data and counting down?

  5. Bob Smith said:

    Please ask Michelle or what ever he/she calls themself. Why is the removal system such a pain to be removed? I’ve been trying for over a year. One snappy email and they avoid you like the plague. I just tell everyone SORBs suck and leave it at that. If they question me about my opinion I say Google SORBs Sucks and read it for yourself.

  6. Todd Glassey said:

    SORBS is a nightmare. The service is abusable and has been used to cause immeasurable damage to the accessibility of individuals based on negligent email practices.

    The problem with SPAM is 100% tied to the refusal of Email Systems Operators to enforce proper HEADER FILTRATION. That’s it – if a EMAIL Provider enforced the header info verification they would have to put in place larger systems. But this was always intended in the design of SMTP.

    The reason SPAM exists at all as a possibility is lazy network and DNS administration practices. And with the proper SMTP security features installed and enabled most all SPAM can be stopped cold.

    The problem is the network providers need the spam bytes to meet their backhaul agreements and anyone who cannot do the math such that this is clearly demonstrated to them should just give up.

    SORBS then while it may have been a good idea is a major pain in the Ass now. I speak from personal experience with parties listing commercial email providers with our domains on them and make it impossible to scrub them.

    IMHO – Legal action is the ONLY Solution and SORBS and its management team need to be held accountable for the damage they have done to legitimate parties in interfering with their ability to route and deliver email over the Internet.

    Just my two cents.

  7. michael said:

    I have to start that I completely agree with sorbs and the idea behind it..
    However one of my users recently acquired a bug that sent emails via my server and the result was that the server was black listed.. fine.
    On this occasion I found that I had to create a account with sorbs in order to de-list my server (after disclosing my personal information including things like my address etc.)…. fine
    Delisting the address became a nightmare because they required a key to be published that the website would not publish and when I attempted to send a ticket a robot kept closing it… A reply to this ticket from my email account would re enable the ticket however as my ip address was blocked I was just getting errors from sorbs to advise I was on their black list.. Quality… NOT.
    Can the procedure be looked in to ensure that the little guy can function?
    I don’t believe that the procedure should be that hard to make a de-list request?? At the end of the day I am now aware of any problem and sorbs now have all my contact details so they can contact me if they believe a problem still exists.
    SORBS, if you’re reading, please look into it.
    Michael (a postmaster)

  8. adrianTNT said:

    My IP got listed, they consider it spam, emails are not close to spam, receivers agreed and confirmed their email address, then all messages contain one-click unsubscribe at the bottom, we have feedback loops with Yahoo, Hotmail and AOL.

    Now… Sorbs website is full of bugs and software errors, sending you from one page to another.

    Like someone else mentioned above, they ask you for all your registration details and only then (hardly) you get to a page that says you cannot delist your IP address.

    I replied to the ticket mentioning some software errors that I get while trying to delist my IP but an automated reply told me that they already closed the ticket and they considered it solved. I don’t even want to get into the crappy usability of their online contact form steps.

    I cannot see why someone would pick them to decide what is and what is not spam.

    These people are worse than spammers themselves.

    And they are NOT “fighting spam” as they say, they are actually fighting legitimate messages.

    People like them should have no place in web business.

  9. vector said:

    I work in IT nearly 25 years. I must say that with more arrogant asshole I have never met. Level functionality of their website is school project. demented spamlist for stupid people. I do not understand the meaning of this conversation. Mühle we ask someone to give us a more intelligent answer. “Hello wall, what do you think?”
