Alex Bainbridge's Musings on travel ecommerce blog

Is screen scraping ethical? [Discuss]

Tuesday, August 26th, 2008

I admit I was shocked when I read that 300 travel websites have been accused by Ryanair of screen scraping its site (and have therefore been served a cease and desist legal letter). 300 does seem a rather large number.

Take into account that some of the websites mentioned are duplicates (for example, various regional versions of the same site) and that others are 3rd party distributors of product from websites also on the list, and you will end up with a smaller number. However, the number will still be much more than the 5-10 or so I initially thought it was.

Anyway, if it was just a few sites you could say that screenscraping was done by “the few”. These few would be those undertaking business either as “cowboys” or because they have understood the business risks, accepted the risk and screen scraped anyway.

However, to me, it’s a completely different story to find that most of the leading online travel websites have been accused of being part of this Ryanair story.

The ethics of screen scraping

As a developer, if someone asked me to develop a system for sending spam email I would object. Indeed I have seen developers turn around to product managers and say they are not building something that can be used for spam email. Although software development doesn’t quite have the same maturity of professional ethics as, say, architecture, there are some ethical standards that are generally accepted to exist.

I believe that all developers who consider themselves to be professionals would think and act the same way. Spam email is a no-no.

However, if as a developer you are asked to write a screen scraping tool - would you object? Is screen scraping acceptable when you know it is against the terms & conditions of the object site?

Consider this - if you are a developer working for a leading online travel agent that screen scrapes…. roll forward a few years when online travel agents will (may!) be overtaken by supplier websites (such as tour operators and direct bookings with airlines). If you are sending your CV to a supplier - will you mention on your project list that you have written a screen scraping tool?

Would you be hiding that - or proud of it?

If you consider that you won’t be proud of writing a screen scraping tool….. then you probably ought not be doing it now!

What do you think? Is screen scraping ethical?









19 Responses to “Is screen scraping ethical? [Discuss]”


  1. August 26th, 2008 at 11:20 pm
    Stephen Joyce

    I would object to building a screen scraping tool. The reason is simple, if a company wanted to share their data and booking capabilities, then they would build a platform or API to do so. If they don’t, then they are sending a clear message that they are either not technically capable of building one or sharing of their data is not a priority. In either case, screen scraping their site is the equivalent of hacking their site (taking the html code, disassembling it, stripping out the data you need, re-engineering and then posting the content back to the server).

    My feeling is that if they don’t want to provide an API then they can play by themselves. With so many OTAs and agencies looking for products to sell, why should they have to resort to scraping a site so they can tag on a small booking fee to the transaction. I don’t think it is up to OTAs and technology providers to force “sharing of data” on suppliers. In my opinion, sharing of data is in the best interests of the supplier and they should be the ones embracing it. Not sharing is a business decision that the rest of the community can disagree with, but should respect.

  2. August 27th, 2008 at 1:07 am
    Stuart

    Agree with Stephen — is there an API? Yes — then use it, No — please go away.

    Point 2 of Ryanair’s TOU is a pretty clearly cut No to the screen-scrapers.

  3. August 27th, 2008 at 9:35 am
    Alex Bainbridge

    Thanks Stephen

    To summarise, you believe: Screen scraping = hacking

  4. August 27th, 2008 at 9:42 am
    James

    Well Ryanair’s booking engine is certainly very slow but I suspect that there are other reasons for that.
    I can understand both their business reasons (low air fares so they want to be able to get the hotels etc bookings themselves), and the operational reasons (impossible to contact the customers if e.g. schedule changes), for Ryanair’s enforcement against screen scraping. I wonder if they will see a drop in load factor though?

  5. August 27th, 2008 at 7:03 pm
    Stephen Joyce

    Yes. I believe screen scraping is hacking. Let’s use a non-travel example. I build a website that uses a screen scraper that allows you to log into your on-line bank accounts (all of them in one place), transfer money, check statements, and do all those other things and then the site charges a small fee to do it. I’m 100% sure that the banks would shut it down, not to mention that most people would recognize that this is probably not a safe, ethical, or secure thing to do. So why would anyone in travel think it is okay to do the equivalent with suppliers? It just doesn’t sit right with me.

  6. August 27th, 2008 at 7:12 pm
    Alex Bainbridge

    Hi Stephen,
    I agree with you!

  7. August 28th, 2008 at 1:14 pm
    Alastair James

    If screen scraping is ‘hacking’ then you should also include google and the like as ‘hackers’. What is a search engine crawler if it’s not a ‘screen scraper’?

    The difference is how the data is used. Some companies on the ryanair list allowed you to book with them (sometimes with an additional fee). This is of course wrong, and ryanair has every right to defend itself. However, many companies just use screen scraped data for ‘flight comparison’ purposes. In this situation there is no difference between the morals of a traditional search engine and a ‘screen scraper’. They are the same thing and I can’t see how they would be doing anything except helping ryanair.

    Imagine if google started using its ‘crawled’ (screen scraped?) data to start showing little links saying ‘flights from £xx’ under ryanair’s listing for searches of the form ‘cheap flights to madrid’. I can’t see anyone complaining then, but it is the same!

  8. August 28th, 2008 at 1:19 pm
    Alex Bainbridge

    Hi Alastair,
    Thanks for your comment. The difference is that, for search engines, you can (as the web master bod), opt out by using a robots.txt file.

    There is no way you can opt out of being screen scraped….. (except through technical means which are difficult to do without blocking consumers accidentally) - or through legal / commercial means. Ryanair seem to be taking the legal / commercial route.

  9. August 28th, 2008 at 1:28 pm
    Alastair James

    Yes, I agree. ‘Screen scrapers’ should abide by the robots.txt file. However, just looking into ryanair as an example, all availability searches occur on the ‘bookryanair.com’ domain. However there is no robots.txt file on that domain (it should be http://www.bookryanair.com/robots.txt), so nothing to ‘opt them out’!
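The robots.txt opt-out discussed in the comments above can be checked by a crawler before it requests anything. The following is a minimal sketch in Python using the standard library's urllib.robotparser. The crawler name ExampleFareBot, the example.com URLs and the sample rules are all assumptions made up for illustration; they are not Ryanair's real file or any real crawler. An empty rule set stands in here for a domain that publishes no robots.txt at all.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical crawler name, used only for this illustration.
USER_AGENT = "ExampleFareBot"

# Made-up rules: the site owner opts every crawler out of /search/.
SAMPLE_RULES = """\
User-agent: *
Disallow: /search/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, url: str) -> bool:
    """Return True if the rules allow USER_AGENT to request the URL."""
    return parser.can_fetch(USER_AGENT, url)


# A site that has opted out of having its search pages crawled.
opted_out = build_parser(SAMPLE_RULES)

# A site with no rules at all (stand-in for a missing robots.txt).
no_rules = build_parser("")

if __name__ == "__main__":
    search_url = "http://www.example.com/search/flights?to=MAD"
    print(may_fetch(opted_out, search_url))
    print(may_fetch(no_rules, search_url))
```

Against a live site, a crawler would typically call set_url() and read() on the parser instead of parse(), and would skip any URL for which the check returns False. With no rules published, the check permits everything, which is the situation described in the comment above.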

  10. August 28th, 2008 at 2:23 pm
    Andri

    Adding extra cost to the fare is cheating and unethical. Simply helping people find better deals and forwarding them to the appropriate airline sites is not unethical in any sense of the word. You have to make a clear distinction between these two cases, as Alastair points out.

  11. August 28th, 2008 at 2:25 pm
    Alex Bainbridge

    Hi Alastair,
    That is a very interesting comment.
    I am not convinced that all the alleged screen scrapers would STOP if Ryanair did have a robots.txt file…… but I think, if I were Ryanair, I would have placed one on my server before making this big legal and PR fuss!

  12. August 28th, 2008 at 5:54 pm
    Stephen Joyce

    @ Alastair and Andri,

    I agree, certainly in the case of the sites that just display pricing and forward the consumer back to Ryanair for booking, this would be similar to the traditional “comparison shopping engines” used for traditional goods like cameras, clothes, and electronics, so the “hacking” comparison doesn’t really apply. In my example, however, I am specifically referring to sites that interfere with the booking process or manipulate the data in some way.

    The other point I want to make is that Ryanair has made a business decision to offset the discount/low cost airfares by generating revenue from ancillary product sales, such as hotels, car hire, excursions, and in flight extras, none of which can be supported through the sites that are screen scraping. The airfares are priced to drive consumers to their site in order to purchase other goods and services. By focusing on the airfares, the screen scrapers are making it harder for Ryanair to maintain low cost fares and as a result reducing their profitability. The screen scraping essentially subverts Ryanair’s business model, which is why Ryanair has to protect itself.

  13. August 28th, 2008 at 7:16 pm
    Alastair James

    I suppose the point I wanted to make is that I think screen scraping can be ethical as long as a) You pass the user through to the end site to book and b) your crawler respects the robots.txt file. Indeed, in my mind screen scraping = search engine if those constraints are met.

    @Alex, Try a search for site:bookryanair.com in google. They obviously have not had a robots.txt file for a while!

    @Stephen, I see the point about loss of revenue. I guess at the end of it all, ryanair is trying to become a ‘travel site’ where you can book it all. They must think that playing nice with other ‘travel sites’ is not good for that. A bit short-sighted; I would have thought a better tactic would be to allow ‘nice’ screen scrapers but monetise the booking process better, i.e. sell the ‘extras’ during the booking process that the ‘nice’ screen scraping site has referred them to.

  14. September 2nd, 2008 at 11:59 am
    Alastair James

    I guess Ryanair agree with me!

    Ryanair offers meta search engines an olive branch

  15. September 2nd, 2008 at 12:13 pm
    Alex Bainbridge

    Hi Alastair,
    Interesting turn of events yes…. but not sure you can argue that Ryanair now agree with you.
    The ethics discussion has always been around screen scraping WITHOUT permission - all Ryanair have done now is give wider permission!
    Alex

  16. September 2nd, 2008 at 12:16 pm
    Alastair James

    Sorry, I meant about ‘nice’ screen-scraping being a benefit to their business.

  17. September 2nd, 2008 at 8:16 pm
    Alex Bainbridge

    Hi Alastair,
    True.

  18. October 10th, 2008 at 7:50 pm
    Aaron Helton

    I am going to have to completely disagree with the idea that screen scraping is unethical in and of itself. It really depends on how the content is being used. If you’re talking plagiarism (a very common occurrence on the Web), that’s one thing, and in no circumstances is plagiarism ethical. However, there’s a continuum between commercial and non-commercial application that can be considered here, and it bears looking at a little more closely. If you build a screen scraper to aggregate data that has been made public by virtue of the Web, but the author or owner of the data lacks the technical capability to provide an API for easy retrieval (RSS for instance), then it should be fair use to create such an API for simple utility purpose, so long as the links lead back to the originating site. We can argue all day about the contents and merits of the various TOS agreements and how they apply to any set of uses, but in the end, when people publish things on the Web, they become accessible in ways that the author may not have understood, wanted, or expected, or known. It is nevertheless the responsibility of the content owner to learn how these things work.

    Now, if someone goes out and builds an application that exposes information that normally requires authentication to see, you might have a bit more of a case as the owner, and the ethics of building such a system are definitely in question. But if it’s publicly available without authentication, I say it’s fair game until the owner asks you to stop. Otherwise, how can anyone ethically justify the existence of sites like Digg, where a portion of the site’s content may be copied to fill out a description, and which frequently has the effect of bringing down the servers that host content that becomes popular?

  19. October 10th, 2008 at 9:40 pm
    Alex Bainbridge

    Hi Aaron
    Yeah I should have probably framed this as being a travel industry specific question.

    Airlines are going out of business and need revenue that they can earn from selling “extras” to their customers (e.g. a hotel stay). However, in the travel business, some of the screen scraping companies are minimising the revenue that the airlines (like Ryanair) can earn.

    In a situation where the airlines have said they don’t want screen scrapers, where the industry knows the airline doesn’t want screen scrapers, and where we are all in the same (small) industry, the ethical question comes down to whether websites should continue down this path or whether the airlines are “fair game”.

    Digg is a benefit to both the end site - and the “consumer”. The airline situation is a negative (which is worse than being of no impact at all)

    Thanks for your comment though. I think you made good points for the general “web” - and actually that is where some of the issue in the travel industry is - as “web ethics” are not quite aligned to “small industry where everyone knows everyone” ethics.



Comments for this post will be closed on 10 October 2009.




This blog is about travel ecommerce with a focus on topics of interest to tour operators & travel companies

Alex has previously started up a small tour operator (5 staff) and also worked for leading "dot coms", airlines, hotel chains and tour operators advising and project managing web, ecommerce and reservation system projects.

Alex is available for travel ecommerce consulting via Travel UCD. Travel UCD also operates TourCMS - a web based reservation system for small tour operators

