Thank you to those who gave me, and the company, the time to investigate the claims made. It’s very easy in the digital age to simply hit “retweet” and join the virtual mob, and to those who hung back to wait for the truth, thank you. Separate from the public discussion being played out over Twitter, there’s been a parallel discussion between Rich Tomko, CEO of Snooth, and Eric LeVine, Founder of CellarTracker, as we all work to understand the genesis of the issue.
To be clear, we do not scrape data and we do not steal data. Typically, data on Snooth is sent to us, uploaded via our Merchant Hub, or crawled in accordance with web standards (robots.txt, the same convention that Google and other search engines follow).
I think CellarTracker provides a great service to serious wine collectors out there. I know this because members of the team here continue to rave to me about how awesome it is. That is why, way back in April 2007 when Snooth first launched, I negotiated an agreement with CellarTracker to feature reviews from their site (announced here). The relationship evolved over the ensuing six months, and ultimately, in October 2007, the partnership ended and we were asked to remove “user reviews” -- which we did, in their entirety.
At the end of October 2007, CellarTracker wrote to us asking us not only to remove the reviews (which had already been done), but also to take down other pieces of content on our site. This is where everything gets complicated, so bear with me as I explain how Snooth is built.
Snooth is an aggregator -- we collect data from around the web. We’re more than that today, with a thriving social community and a strong editorial component, but back then, when there were just two of us working off my co-founder’s kitchen table in Brooklyn, we decided to make our efforts stretch by building a “Google meets Expedia” for wine.
Today we aggregate information from over 10,000 sources, with thousands of these sources actually sending us information in the form of Excel and CSV files, or data feeds. All of this is managed via our Merchant Hub, which is an incredibly sophisticated tool able to recognize and fix wine data errors on the fly. It’s something we are very proud of, and we are continuously working to improve it. In addition to the Hub, we have the Snooth Crawler. This is essentially a robot that behaves like a mini-Google, wandering around the web looking for wine content to add to the site. We currently monitor more than 50 million pages focused on wine, and we check in every few days to see if there’s any new information on each page. The Snooth Crawler “behaves” well: we don’t visit sites too frequently, and we leave a signature behind so that webmasters can see that it was us calling. If they want to stop us from coming, it’s simple to request that, and our robot will not visit that site again.
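For readers who want to see what “behaving well” can look like in practice, here is a minimal sketch in Python of a crawler checking a site’s robots.txt before fetching a page, and waiting a few days between visits. It is an illustration only, not Snooth’s actual code: the user-agent name `SnoothCrawler`, the three-day revisit interval, and the example URLs are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from urllib.robotparser import RobotFileParser

# Hypothetical signature; the crawler's real user-agent string is not given here.
USER_AGENT = "SnoothCrawler"
# "Every few days" is assumed to mean three days for this sketch.
REVISIT_INTERVAL = timedelta(days=3)


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse the text of a robots.txt file without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, url: str) -> bool:
    """Return True only if the site's rules allow our crawler to fetch this URL."""
    return parser.can_fetch(USER_AGENT, url)


def due_for_revisit(last_visit: datetime, now: datetime) -> bool:
    """Return True once enough time has passed since the last visit."""
    return now - last_visit >= REVISIT_INTERVAL


# A site that has asked this crawler, and only this crawler, to stay away.
BLOCKING_ROBOTS_TXT = """\
User-agent: SnoothCrawler
Disallow: /

User-agent: *
Disallow: /private/
"""

# A site that only keeps one directory off limits to all crawlers.
OPEN_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

blocking_site = build_parser(BLOCKING_ROBOTS_TXT)
open_site = build_parser(OPEN_ROBOTS_TXT)
```

With these rules, `may_fetch(blocking_site, "https://example.com/wines/1")` is false, while the same URL on `open_site` is allowed. A webmaster only has to add two lines to robots.txt to turn a crawler like this away.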
To date, no site has ever asked to be removed from our index. It sometimes takes some explaining (“What are you doing with my data?”), but site owners mostly recognize the value of what we’re doing: namely, raising the visibility of a winery’s products and driving sales for both wineries and retailers. It’s the wine equivalent of Google, Expedia, or CitySearch.
Back to CellarTracker. We realized yesterday that some of our wine descriptors, namely the user tags (not reviews or information that is, or could be, copyright-protected) that are automatically extracted from user reviews, were still being calculated using CellarTracker information as one of the thousands of data sources. This data was being pulled via the original CellarTracker XML feed, which was set up under the 2007 agreement. Once we discovered that this data was still contributing to the compilation of these tags, and having been informed by CellarTracker about their position on this issue, we immediately switched the feed off and began the process of extracting the information from the site. The process will take several days, as we have to recalculate the wine tags across more than 3 million wines.
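To make the recalculation concrete, here is a rough sketch in Python of how tags might be derived from review text while leaving out one data source. This is a simplified illustration and not our production code: the descriptor vocabulary, the source names such as `cellartracker_feed`, and the sample reviews are all made up for the example.

```python
from collections import Counter

# Hypothetical vocabulary of wine descriptors that can become tags.
DESCRIPTORS = {"cherry", "oak", "tannic", "citrus"}


def compute_tags(reviews, excluded_sources=frozenset(), top_n=3):
    """Count descriptor words across reviews, skipping any excluded source."""
    counts = Counter()
    for review in reviews:
        if review["source"] in excluded_sources:
            continue  # this source no longer contributes to the tags
        words = {w.strip(".,!?;:").lower() for w in review["text"].split()}
        counts.update(words & DESCRIPTORS)
    # Sort by descending count, then alphabetically, so the result is stable.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:top_n]]


# Made-up reviews for a single wine, each labelled with a hypothetical source.
SAMPLE_REVIEWS = [
    {"source": "merchant_feed", "text": "Bright cherry and oak."},
    {"source": "cellartracker_feed", "text": "Tannic, with citrus notes."},
    {"source": "crawler", "text": "Lots of cherry."},
]

tags_before = compute_tags(SAMPLE_REVIEWS, top_n=10)
tags_after = compute_tags(SAMPLE_REVIEWS, excluded_sources={"cellartracker_feed"})
```

In this toy case the tags contributed only by the excluded feed drop out once it is removed. Switching the feed off stops new data from arriving, but the existing tags still have to be recomputed wine by wine, which is why the cleanup takes days rather than minutes.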
Snooth’s code base, the guts of the code that make the site live and breathe, has been written by tens of people over four years, involving more than 10,000 individual code submissions and over 1 million lines of code. Amongst that spaghetti of code, this routine slipped through. Specifically, the code that pulled the CellarTracker feed was not switched off when the agreement between us ended. Although no reviews went on the site, the wine tags were still being calculated with CellarTracker as one of the sources.
For that I’m sorry. To Eric LeVine, who has single-handedly built CellarTracker into the best tool for the serious wine collector, and to any CellarTracker users who feel their data was misused, my apologies.
Again, thank you to those who gave us the time to understand the situation and to respond appropriately.