Scott Greenfield over at the incredible blog Simple Justice recently noticed that his site’s RSS feed was being republished on the new Law Ratchet website. The site appears to be a curated collection of legal blogs, republished in a way that makes the content look like Law Ratchet’s own (rather than operating as a targeted RSS feed reader and curator).
We had far more of a discussion than Twitter comfortably allows, and he took a few minutes to put together a post on the copyright infringement. My thoughts, in turn, proved far too lengthy to fit into his comment box.
While I think Law Ratchet is in the wrong here, I do see the potential for a similar service that not only respects the wishes of copyright holders, but provides significant value for those who wish to read their law-related RSS feeds in a single, attractive, web-based interface.
For me, “scraped” content actually involves screen-scraping, a data collection method most frequently associated with content that doesn’t offer an RSS feed. For example, I’ve used scraping to pull information from a county recorder’s website to populate a database of deed of trust records for analysis. This type of scraping is hard to do, and even minor changes to the page being scraped can break the system.
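Purely to illustrate the difference, a scraping sketch in Python might look something like the following. The URL, query parameters, and table layout are all invented for this example; the point is how tightly the code is coupled to the page’s HTML.

```python
# Hypothetical example: screen-scraping a county recorder's search results page.
# The URL, parameters, and table structure are invented for illustration; a real
# scraper is tightly coupled to the page's HTML and breaks when the markup changes.
import requests
from bs4 import BeautifulSoup

def fetch_deed_of_trust_records(page: int = 1):
    resp = requests.get(
        "https://recorder.example.gov/search",               # invented URL
        params={"doc_type": "DEED OF TRUST", "page": page},  # invented parameters
        timeout=30,
    )
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")

    records = []
    # Assumes results arrive as an HTML table with a header row to skip.
    for row in soup.select("table#results tr")[1:]:
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if len(cells) >= 3:
            records.append({"recorded": cells[0], "grantor": cells[1], "grantee": cells[2]})
    return records
```

Rename a column or change the table’s id and the whole thing falls over, which is exactly why this kind of scraping is a last resort.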
RSS, on the other hand, provides a feed that is designed to be used by the recipient in a way appropriate to their needs. It’s formatted for easy use and separates the various elements of the document, so it need only be parsed and formatted for viewing.
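A rough sketch of what consuming a feed looks like, using Python’s feedparser library (the feed URL here is just a placeholder):

```python
# With RSS there is nothing to scrape: the publisher has already separated the
# title, link, and body of each entry, so the feed need only be parsed and displayed.
import feedparser

feed = feedparser.parse("https://blog.example.com/feed/")  # any feed URL
print(feed.feed.get("title", "(untitled feed)"))
for entry in feed.entries[:5]:
    print(entry.get("title", ""))
    print(entry.get("link", ""))
    print(entry.get("summary", ""))
```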
An entirely appropriate use of RSS feeds by subscribers is through a web-based feed reader. Other uses might include reading the feed in a purpose-built reader app on an iPhone or even using the content to automatically file copyright registrations. Few feed publishers would take issue with a web-based feed reader. After all, Google Reader, and Bloglines before it, have long been incredibly common tools for consuming RSS feeds (including your own). New tools like Flipboard also make use of your RSS feed to display your site’s content for subscribers. Few object to them, and with the coming demise of Google Reader, there’s a new market for web-based feed readers.
I understand the objections to sites that use scraped content. In general, these sites simply use the content of others to generate traffic (along with advertising revenue or some SEO benefit) for their own website. They try to make the source content appear as their own, serving not as a reading tool for willing subscribers but as a vehicle for co-opting the content for their own gain.
Here, it seems that Law Ratchet is using only RSS feeds as its data source; I was unable to find an example of them engaging in traditional screen scraping. At the same time, though, it’s clear they’re not just using the feed as it is provided: a recent Volokh post appears on the Law Ratchet site missing the “Email, Add Comment, and Share on Facebook” links that are present at the bottom of the feed entry. Most troubling, some of the feed content that remains has been altered. For example, paragraph and blockquote tags are rearranged to suit Law Ratchet’s formatting desires instead of the publisher’s.
They almost go out of their way to hide the source site: instead of linking the headline to the real post, or the title of the site to the real site, the only link back is a very small “View Original” button at the bottom of the page, next to numerous other buttons that perform internal actions on their site. In most cases, one of those internal function buttons is actually labeled with the name of the source site. Clicking it, though, keeps you on the Law Ratchet site, making it appear that major websites are simply subparts of Law Ratchet. This provides no real value; it just repurposes the work of others for Law Ratchet’s own gain and creates confusion as to the origin.
While there is some value in having someone else curate a collection of sites containing high-quality content, Law Ratchet doesn’t seem to offer much ability to customize their choices. I can’t see how to remove certain blogs from their sections if I don’t want to read them; I can only select “key words” that might interest me.
Further, they’re not taking any steps to minimize their impact on the sites they republish. Their robots.txt doesn’t restrict the web-based version from being indexed, and even though they’re rewriting some of the content, they don’t seem to be altering the links or adding headers to tell search engines that this is not original content. As a result, the legitimate site may get penalized by search engines that now see its content as spam. There’s also no way for sites to automatically opt out of inclusion (short of identifying Law Ratchet’s IP address and blocking it from retrieving their feeds).
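The signals that would address this are well established, even if Law Ratchet doesn’t use them: a canonical link pointing back to the publisher’s copy and a noindex directive on the republished page. Something like the following (the URL here is invented) in the head of each republished entry would tell search engines where the original lives:

```html
<!-- Illustrative only: placed in the <head> of each republished entry page.
     The canonical URL is invented; it would point to the publisher's copy. -->
<link rel="canonical" href="https://www.example-lawblog.com/2013/03/original-post/" />
<meta name="robots" content="noindex, follow" />
```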
Clearly Law Ratchet, in addition to picking a completely ridiculous name, has made some serious errors here. I have no doubt that Mr. Randazza will ensure those are remedied promptly.
What’s more interesting to me, though, is how one could provide a similar service legitimately in a way that provides added value to the reader and keeps content publishers happy. I imagine that service being quite similar to Google Reader, but admit that others may have more novel options.
First and foremost, it’s important that the content appear only at the request of the user. Much like my Aereo service, having a remote system that I control is far different from a service that records all the content and then lets me see it later. Meltwater‘s clipping service, for instance, took the latter approach and found itself on the wrong side of Castle Rock Entertainment, Inc. v. Carol Publishing Group, 150 F.3d 132, 145 (2d Cir. 1998), where the concern is not with whether “the secondary use suppresses or even destroys the market for the original work or its potential derivatives, but [with] whether the secondary use usurps or substitutes for the market of the original.”
Such a site has to get away from the idea of the content being its product and instead treat the platform as the product. Trying to exert control over source material by rewriting it, removing bits the site owner doesn’t like, and otherwise making it extremely difficult to return to the source is a dangerous tack to take. As a platform, though, you take on different goals. You would want to keep automated indexers away by applying an appropriate robots.txt policy (Google, for instance, explicitly prevents indexing with a “Disallow: /reader/” instruction). You’d focus on adding value to the interface (though, I must admit, I find the minimal styling of Law Ratchet attractive) and the functionality of the site.
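As the Google Reader example suggests, the policy itself is only a couple of lines; a comparable rule for a reader hosted under a hypothetical /reader/ path would be:

```
# robots.txt for a hypothetical reader hosted under /reader/
User-agent: *
Disallow: /reader/
```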
I like the idea of curated content – someone taking the time to identify the best sources of specific types of information. I don’t want to be locked into only that person’s choices, however. Perhaps they enjoy a certain source I do not. Perhaps I have discovered something they’ve not seen. Google Reader had several themed categories of content which would auto-subscribe me to certain feeds. I made heavy use of this feature and found sites that I’d have never accessed without it. But, I chose to see those feeds and could even turn them off as desired. This is significantly different from Law Ratchet’s set-up.
One need not forgo profit in such a venture. Another case referenced in Meltwater was Harper & Row Publishers, Inc. v. Nation Enters., 471 U.S. 539, 562 (1985), which noted that the “crux of the profit/nonprofit distinction is . . . whether the user stands to profit from the exploitation of the copyrighted material without paying the customary price.” Meltwater was using sources that usually have a significant cost component attached. None of the blogs I visit, however, charge for access.
Without altering the feeds, it is entirely possible to provide advertising on the site or (a model I think would be best for a web-based RSS reader) to charge the users of the service directly. I read a lot of feeds (according to Google Reader, over the last 30 days I read 36,110 items, clicked 264 items, starred 19 items, and emailed 83 items from my 477 subscriptions) and would gladly hand over a non-trivial amount of money to someone who made me a good replacement. For the moment, I’m planning to move to TheOldReader, but I’m always on the lookout for a better option.
I think there could be value in a service similar to Law Ratchet – that is, a targeted RSS reader for specific categories of sites – but I don’t think Law Ratchet hit the mark. They made something that was pretty, but lost sight of the built-in risks that come with designing something that relies on the work of others.