I admit I was shocked when I read that 300 travel websites have been accused by Ryanair of screenscraping their site (and have therefore been served a cease and desist legal letter). 300 does seem a rather large number.
Take into account that some of the websites mentioned are duplicates (for example, various regional versions of the same site) and that others are third-party distributors of product from websites also on the list, and you will end up with a smaller number. However, the number will still be much more than the 5-10 or so I initially thought it was.
Anyway, if it was just a few sites you could say that screenscraping was done by “the few”. These few would be those undertaking business either as “cowboys” or because they have understood the business risks, accepted the risk and screen scraped anyway.
However, to me, it's a completely different story to find that most of the leading online travel websites have been accused of being part of this Ryanair story.
The ethics of screen scraping
As a developer, if someone asked me to develop a system for sending spam email I would object. Indeed, I have seen developers turn around to product managers and say they will not build something that can be used for spam email. Software development doesn’t have quite the same maturity of professional ethics as, say, architecture, but there are some ethical standards that are generally accepted to exist.
I believe that all developers who consider themselves to be professionals would think and act the same way. Spam email is a no-no.
However, if as a developer you are asked to write a screen scraping tool, would you object? Is screen scraping acceptable when you know it is against the terms & conditions of the target site?
Consider this - if you are a developer working for a leading online travel agent that screen scrapes…. roll forward a few years when online travel agents will (may!) be overtaken by supplier websites (such as tour operators and direct bookings with airlines). If you are sending your CV to a supplier - will you mention on your project list that you have written a screen scraping tool?
Would you be hiding that - or proud of it?
If you consider that you won’t be proud of writing a screen scraping tool….. then you probably ought not to be doing it now!
What do you think? Is screen scraping ethical?
I would object to building a screen scraping tool. The reason is simple: if a company wanted to share their data and booking capabilities, then they would build a platform or API to do so. If they don’t, then they are sending a clear message that either they are not technically capable of building one or sharing their data is not a priority. In either case, screen scraping their site is the equivalent of hacking their site (taking the HTML code, disassembling it, stripping out the data you need, re-engineering and then posting the content back to the server).
My feeling is that if they don’t want to provide an API then they can play by themselves. With so many OTAs and agencies looking for products to sell, why should they have to resort to scraping a site so they can tag on a small booking fee to the transaction? I don’t think it is up to OTAs and technology providers to force “sharing of data” on suppliers. In my opinion, sharing of data is in the best interests of the supplier and they should be the ones embracing it. Not sharing is a business decision that the rest of the community can disagree with, but should respect.
Agree with Stephen — is there an API? Yes — then use it, No — please go away.
Point 2 of Ryanair’s TOU is a pretty clear-cut “no” to the screen scrapers.
Thanks Stephen
To summarise, you believe: Screen scraping = hacking
Well, Ryanair’s booking engine is certainly very slow, but I suspect that there are other reasons for that.
I can understand both their business reasons (low air fares so they want to be able to get the hotels etc bookings themselves), and the operational reasons (impossible to contact the customers if e.g. schedule changes), for Ryanair’s enforcement against screen scraping. I wonder if they will see a drop in load factor though?
Yes. I believe screen scraping is hacking. Let’s use a non-travel example. I build a website that uses a screen scraper that allows you to log into your on-line bank accounts (all of them in one place), transfer money, check statements, and do all those other things, and then the site charges a small fee to do it. I’m 100% sure that the banks would shut it down, not to mention that most people would recognize that this is probably not a safe, ethical, or secure thing to do. So why would anyone in travel think it is okay to do the equivalent with suppliers? It just doesn’t sit right with me.
Hi Stephen,
I agree with you!
If screen scraping is ‘hacking’ then you should also include Google and the like as ‘hackers’. What is a search engine crawler if it’s not a ‘screen scraper’?
The difference is how the data is used. Some companies on the Ryanair list allowed you to book with them (sometimes with an additional fee). This is of course wrong, and Ryanair has every right to defend itself. However, many companies just use screen scraped data for ‘flight comparison’ purposes. In this situation there is no difference between the morals of a traditional search engine and a ‘screen scraper’. They are the same thing and I can’t see how they would be doing anything except helping Ryanair.
Imagine if Google started using its ‘crawled’ (screen scraped?) data to start showing little links saying ‘flights from £xx’ under Ryanair’s listing for searches of the form ‘cheap flights to madrid’. I can’t see anyone complaining then, but it is the same!
Hi Alastair,
Thanks for your comment. The difference is that, for search engines, you can (as the web master bod), opt out by using a robots.txt file.
There is no way you can opt out of being screen scraped….. (except through technical means which are difficult to do without blocking consumers accidentally) - or through legal / commercial means. Ryanair seem to be taking the legal / commercial route.
Yes, I agree. ‘Screen scrapers’ should abide by the robots.txt file. However, just looking into ryanair as an example, all availability searches occur on the ‘bookryanair.com’ domain. However there is no robots.txt file on that domain (it should be http://www.bookryanair.com/robots.txt), so nothing to ‘opt them out’!
Adding extra cost to the fare is cheating and unethical. Simply helping people find better deals and forwarding them to the appropriate airline sites is not unethical in any sense of the word. You have to make a clear distinction between these two cases, as Alastair points out.
Hi Alastair,
That is a very interesting comment.
I am not convinced that all the alleged screen scrapers would STOP if Ryanair did have a robots.txt file…… but I think, if I were Ryanair, I would have placed one on my server before making this big legal and PR fuss!
@ Alastair and Andri,
I agree, certainly in the case of the sites that just display pricing and forward the consumer back to Ryanair for booking, this would be similar to the traditional “comparison shopping engines” used for traditional goods like cameras, clothes, and electronics, so the “hacking” comparison doesn’t really apply. In my example, however, I am specifically referring to sites that interfere with the booking process or manipulate the data in some way.
The other point I want to make is that Ryanair has made a business decision to offset the discount/low cost airfares by generating revenue from ancillary product sales, such as hotels, car hire, excursions, and in flight extras, none of which can be supported through the sites that are screen scraping. The airfares are priced to drive consumers to their site in order to purchase other goods and services. By focusing on the airfares, the screen scrapers are making it harder for Ryanair to maintain low cost fares and as a result reducing their profitability. The screen scraping essentially subverts Ryanair’s business model, which is why Ryanair has to protect itself.
I suppose the point I wanted to make is that I think screen scraping can be ethical as long as a) You pass the user through to the end site to book and b) your crawler respects the robots.txt file. Indeed, in my mind screen scraping = search engine if those constraints are met.
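To make condition b) concrete, here is a minimal sketch of how a ‘nice’ crawler could check robots.txt before requesting a page, using Python’s standard urllib.robotparser module. The robots.txt content, the example.com URLs and the bot name are all made up for illustration; nothing here is taken from Ryanair’s actual site or from any real scraper.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. A real crawler would download the file
# from the target site instead of hard-coding it.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /booking/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    bot = "ExampleComparisonBot"  # hypothetical user agent name
    for url in (
        "http://www.example.com/booking/search",
        "http://www.example.com/timetable",
    ):
        print(url, may_fetch(EXAMPLE_ROBOTS_TXT, bot, url))
```

Note that an empty robots.txt allows everything in this check, which is the situation described above: with no file in place, a crawler has nothing to tell it that the site has opted out.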
@Alex, Try a search for site:bookryanair.com in google. They obviously have not had a robots.txt file for a while!
@Stephen, I see the point about loss of revenue. I guess at the end of it all, Ryanair is trying to become a ‘travel site’ where you can book it all. They must think that playing nice with other ‘travel sites’ is not good for that. A bit short-sighted; I would have thought a better tactic would be to allow ‘nice’ screen scrapers but monetise the booking process better, i.e. sell the ‘extras’ during the booking process that the ‘nice’ screen scraping site has referred them to.
I guess Ryanair agree with me!
Ryanair offers meta search engines an olive branch
Hi Alastair,
Interesting turn of events yes…. but not sure you can argue that Ryanair now agree with you.
The ethics discussion has always been around screen scraping WITHOUT permission - all Ryanair have done now is give wider permission!
Alex
Sorry, I meant about ‘nice’ screen-scraping being a benefit to their business.
Hi Alastair,
True.
I am going to have to completely disagree with the idea that screen scraping is unethical in and of itself. It really depends on how the content is being used. If you’re talking plagiarism (a very common occurrence on the Web), that’s one thing, and in no circumstances is plagiarism ethical. However, there’s a continuum between commercial and non-commercial application that can be considered here, and it bears looking at a little more closely. If you build a screen scraper to aggregate data that has been made public by virtue of the Web, but the author or owner of the data lacks the technical capability to provide an API for easy retrieval (RSS for instance), then it should be fair use to create such an API for simple utility purposes, so long as the links lead back to the originating site. We can argue all day about the contents and merits of the various TOS agreements and how they apply to any set of uses, but in the end, when people publish things on the Web, they become accessible in ways that the author may not have understood, wanted, expected, or known about. It is nevertheless the responsibility of the content owner to learn how these things work.
Now, if someone goes out and builds an application that exposes information that normally requires authentication to see, you might have a bit more of a case as the owner, and the ethics of building such a system are definitely in question. But if it’s publicly available without authentication, I say it’s fair game until the owner asks you to stop. Otherwise, how can anyone ethically justify the existence of sites like Digg, where a portion of the site’s content may be copied to fill out a description, and which frequently have the effect of bringing down the servers that host content that becomes popular?
Hi Aaron
Yeah I should have probably framed this as being a travel industry specific question.
Airlines are going out of business and need revenue that they can earn from selling “extras” to their customers (e.g. a hotel stay). However, in the travel business, some of the screen scraping companies are minimising the revenue that the airlines (like Ryanair) can earn.
In a situation where the airlines have said they don’t want screen scrapers, where the industry knows the airline doesn’t want screen scrapers, and where we are all in the same (small) industry, the ethical question comes down to whether websites should continue down this path or whether the airlines are “fair game”.
Digg is a benefit to both the end site and the “consumer”. The airline situation is a negative (which is worse than being of no impact at all).
Thanks for your comment though. I think you made good points for the general “web” - and actually that is where some of the issue in the travel industry is - as “web ethics” are not quite aligned to “small industry where everyone knows everyone” ethics.