About the web archive of the European Union institutions
Table of contents
- Why do we archive websites?
- Who can use the archive?
- What do we archive?
- How does it work?
- Guidance for website managers
- Takedown policy
- Legal information
- Find out more
Links to the EU web archive are currently unstable.
We are working hard on a solution and apologise for the inconvenience.
If you are looking for archived content, please contact the EU web preservation team: firstname.lastname@example.org
1. Why do we archive websites?
More and more EU information is only made available on the web. However, web content often has a short lifespan and web technologies evolve quickly. So this information is at risk of getting lost, e.g. when websites are substantially changed or taken offline.
This is why we have maintained a web archive for the EU institutions, agencies and bodies (the EU institutions) since 2013.
This archive reflects the content and design of websites as they were at a given point in time. Thanks to it, the information that institutions provided on their websites stays available, even if the original site or page has fully or partially disappeared.
2. Who can use the archive?
The archive is open and available online. Everyone with an internet connection can consult and use it.
3. What do we archive?
The websites of the EU institutions. Most of these are hosted on the europa.eu domain and subdomains and are archived on a regular basis.
Ad hoc crawls of websites that will be taken offline or will change substantially can be done at the request of the respective EU institution. For example, we can archive pages created for certain events.
For the moment, we don't archive the following:
- Links to external websites. When archived sites contain links to an external website you will be redirected to the live website. E.g. if you find a link to the United Nations website (https://www.un.org/) on an archived version of the Europol website, you will be directed to the live version of the UN site.
- Dynamic content.
- Social media.
- Databases. This means that searches will not work, nor will links based on search queries.
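The external-link behaviour described above can be sketched as a simple hostname check. This is only an illustration: the rule the archive actually applies is not documented here, and the `ARCHIVED_DOMAINS` tuple and both function names are assumptions made for the example.

```python
from urllib.parse import urlsplit

# Assumption for illustration: only hosts on europa.eu are in the archive.
ARCHIVED_DOMAINS = ("europa.eu",)


def is_archived_host(url, archived_domains=ARCHIVED_DOMAINS):
    """Return True if the URL's host is an archived domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in archived_domains)


def resolve_link(url):
    """Say where a link found on an archived page would lead."""
    return "archived copy" if is_archived_host(url) else "live website"


print(resolve_link("https://www.un.org/"))
print(resolve_link("https://www.europol.europa.eu/about"))
```

Matching on the full domain suffix (with the leading dot) avoids treating an unrelated host such as `noteuropa.eu` as part of the archive.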
4. How does it work?
Websites are archived four times a year using a web crawler. This crawler visits and explores the selected websites by following hyperlinks — much like a human user would — and copies the pages and files it comes across.
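As a rough illustration of how a crawler follows hyperlinks and copies what it finds, here is a minimal breadth-first sketch. It is not the crawler used for the EU web archive. The `SITE` dictionary is a hypothetical in-memory stand-in for a live website, and `crawl` and `LinkExtractor` are names invented for this example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch):
    """Breadth-first crawl from seed; fetch(url) returns HTML text or None."""
    archive = {}
    seen = {seed}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # page could not be retrieved, so nothing is copied
        archive[url] = html  # keep a copy of the page
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments before queueing.
            target, _fragment = urldefrag(urljoin(url, href))
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return archive


# Hypothetical site: /search is reachable only through a POST form.
SITE = {
    "https://example.europa.eu/": '<a href="/news">News</a> <a href="/about">About</a>',
    "https://example.europa.eu/news": '<a href="/">Home</a> <a href="/news/2019">2019</a>',
    "https://example.europa.eu/about": '<form method="post" action="/search"></form>',
    "https://example.europa.eu/news/2019": "<p>Archived article</p>",
    "https://example.europa.eu/search": "<p>Search results</p>",
}

snapshot = crawl("https://example.europa.eu/", SITE.get)
print(sorted(snapshot))
```

In this sketch the `/search` page never enters the snapshot, because the only route to it is a form submission rather than a hyperlink.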
Users can navigate the archived sites like a live website. However, archiving with a crawler has some technical limitations and as a result certain features may not work, including the following:
- the original website’s built-in search;
- content that can only be reached after logging in;
- certain navigational elements, e.g. drop-down menus, tick boxes and some maps;
- Flash animations and games, streaming media and embedded social media;
- POST functionality.
5. Guidance for website managers
Preparing websites for archiving
- To optimise the quality of the archived versions of your website, follow the best practices for creating and maintaining preservable websites. This is especially important for your homepage, as it is the entry point for the crawler.
- For more information, see the Information Providers Guide.
- Remove content that should not be preserved and remain accessible in the long term, for example for reasons of intellectual property rights (e.g. copyright), confidentiality, privacy or data protection.
- If this content cannot be removed before archiving, prevent it from being crawled by using a robots.txt file.
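As a hedged example of the last point, the sketch below defines a small robots.txt and checks it with Python's standard `urllib.robotparser` module. The paths `/internal/` and `/drafts/`, the host `example.europa.eu` and the user-agent string are all hypothetical; the real crawler's user-agent name is not given here, so the rules are addressed to every crawler with `User-agent: *`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep two directories out of every crawl.
ROBOTS_TXT = """\
User-agent: *
Disallow: /internal/
Disallow: /drafts/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(url, user_agent="example-archive-crawler"):
    """Return True if the rules above allow this (hypothetical) crawler to fetch url."""
    return parser.can_fetch(user_agent, url)


print(may_crawl("https://example.europa.eu/news/"))
print(may_crawl("https://example.europa.eu/internal/report.html"))
```

Checking the file this way before a scheduled crawl helps confirm that the excluded paths are the intended ones. The file only affects crawlers that honour robots.txt.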
6. Takedown policy
In certain legitimate circumstances, pages in the web archive may need to be hidden from public view.
Anyone can submit a reasoned takedown request. To initiate one, please write to: email@example.com.
Takedown will only be considered in one of the following cases:
- if the page includes one of the following types of content:
  - personal or sensitive personal information, as defined by Regulation (EU) 2018/1725 on the protection of natural persons with regard to the processing of personal data by the Union institutions, bodies, offices and agencies;
  - copyright protected material for which the necessary rights are not held;
  - defamatory or obscene material or messages;
- if the content of the page may cause serious and real administrative difficulties to the website owner;
- if the page was published in good faith, but circumstances for this have changed and its takedown is now considered appropriate;
- if the page was published in error and its takedown is deemed necessary to correct this mistake.
7. Legal information
© European Union, 2019
The Publications Office carries out web archiving to preserve the websites of the European Union. Most of the archived content of the websites accessible in the EU web archive (EUWA) is under EU (or EU institutions, agencies or bodies) copyright. Ownership and copyright of websites in the EUWA remain the responsibility of the website owners.
Unless otherwise stated, the material obtained from the EUWA may be freely reproduced. This general principle can be subject to conditions, which may be specified in individual copyright notices. It does not apply to photographs, videos, pieces of music or other material subject to intellectual property rights of third parties (non-EU). In such cases, permission to use the material must be sought directly from the copyright holders. The Publications Office does not warrant that all third-party content is appropriately marked.
All logos and trademarks are excluded from the abovementioned permission.
Any queries regarding the above should be addressed to the following email: OP-COPYRIGHT@publications.europa.eu.
8. Find out more
Contact the web archiving team: firstname.lastname@example.org