Questions and Answers from TED reusers Workshop
18 October 2023
These are the consolidated Questions and Answers (Q&A) addressed during the TED reusers workshop of 18 October 2023. Replies to questions posted live during the workshop were reviewed and regrouped or modified for a more cohesive and complete response.
Questions about filling in and submitting eForms notices are not included, as they were outside the scope of the workshop. However, we have already conducted a number of workshops for eSenders and eNotices2 users; their resources can be found at https://op.europa.eu/en/web/ted-reusers-workshops and at https://simap.ted.europa.eu/web/simap/eforms.
DATA RETRIEVAL
- I'm concerned that there is no way to straightforwardly get the link for an XML bulk download for a given day. As I see it, you have to:
a. Manually download the release calendar and store it for use in your program.
b. Use the release calendar to build the download link and trigger the download for a given day.
This is a step down from today, where you can connect to the FTP and, based on the date, list (with an “ls” command) the tar.gz file for the given day.
If the links to the daily files were given as a date (e.g. https://gamma.tedv2.spikeseed.cloud/packages/dailyYYYYMMDD) instead of the OJ S number, it would be easier to automatically fetch a daily package without knowledge of the release calendar.
Answer: The release calendar does not change from day to day, so it can be downloaded (manually) once for a given year and then used accordingly.
- Will the (reappeared) bulk download stay? Does the download require individual access tokens or is it anonymous?
Answer: The “bulk download” functionality has always been available on the current TED website (for signed-in users) and will also be available on the new TED website (TED 2.0). The only difference is that the new website does not require users to sign in to download the daily or monthly packages.
- Will XML format tender notices be available in the future?
Answer: If “XML format tender notices” means public procurement notices compliant with the TED schema (not eForms), the answer is yes: they will exist in TED 2.0, as all the procurement data (notices) in the current TED website will be migrated to TED 2.0. If it means whether notices will continue to be available in XML format, the answer is also yes, since XML is the format in which the Contracting Authorities (Buyers) send public procurement notices for publication on TED. Notices are available for download individually (as a single notice) or in bulk (as a package of many notices).
- When will the HTTPS be available to download XML?
Answer: Please see the answer to question 2.
- As the FTP access gets shut down, how do we access the bulk download of daily packages? A curl example makes my problems with name guessing understandable.
Answer: In the TED 2.0, the user can download packages using a direct link:
The URL format for the daily packages is https://{ted-url}/packages/notice/daily/{yyyynnnnn} where {yyyynnnnn} is the OJ S number.
The URL format for the monthly packages: https://{ted-url}/packages/notice/monthly/{yyyy-n} where {yyyy-n} is the year (yyyy) and the month (n).
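The two URL patterns above could be assembled in a script along the following lines. This is only a sketch: the function names are hypothetical, the five-digit zero-padding of the issue number is read off the {yyyynnnnn} pattern, and the unpadded month is an assumption based on the single n in {yyyy-n}.

```python
def daily_package_url(ted_host: str, year: int, issue: int) -> str:
    """Build the daily package link from the OJ S year and issue number.

    Assumption: {yyyynnnnn} is the four-digit year followed by the
    issue number zero-padded to five digits.
    """
    if issue < 1:
        raise ValueError("the OJ S issue number must be positive")
    return f"https://{ted_host}/packages/notice/daily/{year:04d}{issue:05d}"


def monthly_package_url(ted_host: str, year: int, month: int) -> str:
    """Build the monthly package link.

    Assumption: the month in {yyyy-n} is written without a leading zero.
    """
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return f"https://{ted_host}/packages/notice/monthly/{year:04d}-{month}"
```

Under these assumptions, daily_package_url("ted.europa.eu", 2023, 201) gives https://ted.europa.eu/packages/notice/daily/202300201. The host is a parameter so that the same code can point at the Gamma test environment before go-live.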
It should be noted that the {ted-url} part for TED 2.0 will become “ted.europa.eu” on the date when TED 2.0 goes live, that is, when TED 2.0 replaces the current TED website.
- When will the FTP server be dismissed?
Answer: The FTP server will cease to exist when the current TED website is replaced by TED 2.0.
- Question regarding the daily package download: in FTP, I can select the package to download by looking at the upload dates, regardless of the OJ S number or the specific date. In the HTTPS download I would have to "calculate" the correct OJ S number, which seems an unnecessary overhead.
Answer: Please see the answers to questions 5 and 26.
- Will the base URL of the production environment remain as "https://ted.europa.eu/xml-packages/daily-packages/"?
Answer: Please see the answer to question 5.
- At what date will the FTP no longer be available to download notices in bulk?
Answer: Please see the answer to question 6.
- Is there a direct link to the release calendar in CSV format for a given year? We'd like to automate the download and processing, but the link behind the download button seems complex.
Answer: TED 2.0 has no direct link to download the release calendar in a given format.
- So, from 25/10/2023, eSenders need to exclusively use eForms. Does this mean that the daily TED bulk download will only contain the new format (the XML files that start with 00 and have a length of 8 digits)?
Answer: The bulk download will continue to contain the notices sent by the Contracting Authorities. If the Contracting Authorities also send TED schema notices, these will be included in the bulk download. There is no impact on reusers, as the bulk download already contains both TED schema and eForms notices; the only thing that varies is the volume of each schema type published.
- We have an automation to download the daily tenders from a URL like this one: https://ted.europa.eu/xml-packages/daily-packages/2014/12/20141202_2014232.tar.gz
Will this break? Do we need to make a new solution/automation?
From what date will we need to make the change to the new URL format to download daily tenders? If we make it today, will it work normally?
Answer: Your current automation will break when TED 2.0 goes live, that is, when TED 2.0 replaces the current TED website. The test (“Gamma”) environment is available to help reusers adapt and test their services before TED 2.0 goes live.
- For reusers of TED FTP who want to switch to the TED Semantic Web Service (SWS), when will it be ready?
Answer: Please see the answer to question 35. Furthermore, depending on what each reuser is using from the TED FTP, they can evaluate whether what the SWS offers at a given point in time covers their needs.
- Can I get the HTTPS URL that will replace the FTP server?
Answer: Please see the answer to question 5.
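Since the daily link is keyed on the OJ S number rather than the date, a script needs a date-to-issue lookup built from the release calendar (see the answer to question 26). The sketch below assumes the calendar has been saved by hand as a CSV file with two columns named publication_date (ISO format) and ojs_issue. These column names, the file layout and the sample rows are hypothetical, since the actual export format is not described here. It also assumes that the year in the OJ S number is the calendar year of the publication date.

```python
import csv
import io
from datetime import date
from typing import Dict, Optional

# Made-up rows for illustration only; the real calendar must be downloaded
# from the release calendar page and may use a different layout.
SAMPLE_CALENDAR_CSV = """publication_date,ojs_issue
2023-01-02,1
2023-01-03,2
"""


def load_release_calendar(csv_text: str) -> Dict[date, int]:
    """Map each publication date to its OJ S issue number."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        date.fromisoformat(row["publication_date"]): int(row["ojs_issue"])
        for row in reader
    }


def daily_package_url_for(
    day: date, calendar: Dict[date, int], ted_host: str
) -> Optional[str]:
    """Return the daily package link for a date, or None if nothing was published."""
    issue = calendar.get(day)
    if issue is None:
        return None  # e.g. a weekend or public holiday without an OJ S issue
    # Assumed reading of {yyyynnnnn}: year followed by the five-digit issue number.
    return f"https://{ted_host}/packages/notice/daily/{day.year:04d}{issue:05d}"
```

Returning None for dates without an entry lets a daily job skip non-publication days instead of requesting a package that does not exist.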
VISUALIZATION
- Understanding and parsing the content of the new XML format (eForms). What is the intended way of reusing this information besides rendering? Why do we have to rely on a rather slow API to obtain a full text representation of an XML file?
Answer: Extensive documentation and information related to eForms is available on the TED Developer Docs website (https://docs.ted.europa.eu).
Please see also the answers to questions 17 and 29.
- I came across some HTML bugs in the TED viewer API; I found that in the current production version of the TED eForms viewer API some currently published eForms XMLs were not rendered at all.
Answer: There was an issue rendering specific eForms notices due to special characters in the XML itself. This has been corrected and the eForms notices can now be visualized properly. In any case, users are invited to report any issues found while using the Gamma environment to the TED HelpDesk.
- It seems like the existing TED Viewer API is not yet working with eForms. Will there be a new API or will the existing one be updated?
Answer: Two visualization APIs are currently publicly accessible and are listed in SIMAP, on the page “Developers' corner for eSenders and Reusers” under the section “View notices in various formats” (https://simap.ted.europa.eu/en_GB/web/simap/developers-corner).
The first one (https://ted.europa.eu/TEDWS/swagger-ui.html) is used for the rendering of the TED schema notices and the second (https://ted.europa.eu/TEDVIE22/swagger-ui/index.html) is used for the rendering of the eForms notices. These two will cease to exist as soon as TED 2.0 goes live.
However, there is one more publicly accessible visualization API, which is used only for the rendering of eForms notices and is functionally the same as the second one above.
The only difference is that it requires an API key to work. More information on this API can be found at https://docs.ted.europa.eu/api/index.html and at https://viewer.ted.europa.eu/swagger-ui/index.html.
This visualization API will continue to exist, as it is independent of the TED website (both the current and the new one).
- Currently the HTML returned by the eForms viewer notice API is not the same as the HTML from the direct "HTML download" links. Is this normal?
Answer: The HTML returned by the eForms viewer API and the one from “HTML download” are in principle the same in terms of the notice content and its visual structure. However, they differ in styling (fonts etc.), and the former also includes a table of contents. Please also see the answer to question 17.
- Last year you presented us with https://github.com/OP-TED/eforms-notice-viewer which converts eForms into HTML, but it is slow. Will this project evolve and is it maintained? It would allow us not to go through the APIs of the TED website and would allow us to have the HTML more quickly.
Answer: Please see the answers to questions 17 and 38.
DOCUMENTATION
- Insufficient or lacking documentation. Differences between documentation, APIs, and data. Lack of backward compatibility of data formats.
Answer: Please see the answer to question 44. As regards the lack of backward compatibility between the TED schema and eForms data formats, this has been known since the end of 2019, when the first version of the eForms schemas was published. In any case, extensive documentation and information related to eForms is available on the TED Developer Docs website (https://docs.ted.europa.eu).
- On the page below, the links to the eForms specifications (at the bottom of the page) do not work: https://docs.ted.europa.eu/eforms/latest/schema/schemas.html.
Answer: An explanation has been added to the page: the names listed in the table on this page are URIs (Uniform Resource Identifiers) used to uniquely identify the domains. They are not URLs (Uniform Resource Locators) pointing to any online resource, hence they are no longer clickable links.
TEST ENVIRONMENT
- Stability of the testing environment and platform.
Answer: We try to keep the test environment (Gamma environment) as stable as possible. However, TED 2.0 is still under development and fixes may need to be applied, so interventions can always occur. In any case, we try to keep them to a minimum and avoid any impact on reusers, at least during working days and hours.
- Regarding the test environment. So far, I have only had time to look at the webpage etc. Can we from now on also test bulk download and direct links to download HTML and PDF versions? This would give us all time to have our environments ready by January when the new website and download features will be due.
Answer: Yes, the test environment that the Publications Office made available offers all the functionalities of TED 2.0, so any user can review them and test the services that they will offer based on it.
- In the last days, the bulk download on the Gamma environment was delayed (sometimes by days). Is this fixed now?
Answer: Yes. Please also see the answer to question 22.
MISCELLANEOUS
- Missing translations / missing full texts.
Answer: TED 2.0 is not ready yet in terms of its editorial content, translations etc. In any case, the Gamma environment was made available to reusers so that they can get acquainted with and test the TED 2.0 APIs, data retrieval, links to notices etc. and be prepared when it replaces the current TED website.
- What is the OJ S Number? How can we get it by date?
Answer: The OJ S number is the number of the Supplement of the Official Journal. In TED 2.0, the user can download the annual release calendar via the page https://{ted-url}/en/release-calendar. The file contains the OJ S number associated with each publication date for the specific year.
- Do I need to extract the download URLs from the XML bulk download page, as the OJ S number is unknown?
Answer: Please see the answer to question 26.
- Would it be possible to generate an index of all files in the parent directory of the daily archives?
Answer: It’s up to the users to build it if they need it. Please see also the answer to question 26.
- What is the official new direct link structure to a notice?
Answer: Format of the URLs of the direct link:
https://{ted-url}/{lang}/notice/{publication-number}/{format}
where:
{lang} is the language,
{publication-number} is the publication number,
{format} can be:
html: to display the HTML download file
pdf: to download the notice as PDF
pdfs: to download the notice as signed PDF
xml: to download the notice as XML.
Direct link of the notice view page: https://{ted-url}/{lang}/notice/-/detail/{publication-number}
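The link structure above can be turned into small helpers. The following is a sketch only: the function names are hypothetical, and the host is passed in because {ted-url} differs between the Gamma test environment and production.

```python
# Formats listed in the answer above.
NOTICE_FORMATS = {"html", "pdf", "pdfs", "xml"}


def notice_download_url(
    ted_host: str, lang: str, publication_number: str, fmt: str
) -> str:
    """Build the direct link to a notice in one of the listed formats."""
    if fmt not in NOTICE_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    # lang is passed through unchanged; the examples in this document
    # use lower-case two-letter codes such as "en" or "de".
    return f"https://{ted_host}/{lang}/notice/{publication_number}/{fmt}"


def notice_view_url(ted_host: str, lang: str, publication_number: str) -> str:
    """Build the direct link to the notice view page."""
    return f"https://{ted_host}/{lang}/notice/-/detail/{publication_number}"
```

For instance, notice_download_url("gamma.tedv2.spikeseed.cloud", "de", "632932-2023", "html") reproduces the Gamma link quoted in a later question of this section.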
Backward compatibility with the current TED direct links, for example:
https://{ted-url}/udl?uri=TED:NOTICE:735065-2022:TEXT:EN:HTML is redirected to https://{ted-url}/en/notice/-/detail/735065-2022
- Will you have a redirecting mechanism, so that currently working direct links to individual Tenders / Notices will be redirected to the new URL scheme after the launch of the new TED website?
Answer: Yes, TED 2.0 will be backward compatible with the current TED direct links. Example:
https://{ted-url}/udl?uri=TED:NOTICE:735065-2022:TEXT:EN:HTML is redirected to https://{ted-url}/en/notice/-/detail/735065-2022.
- The machine translation links on the notice details do not work for me.
Answer: In TED 2.0, in the section “Machine translation HTML” of the Notice View page for a procurement notice, there was a bug when a user hovered the mouse over the language labels: they were shown as links (URLs) although they were not. This has already been corrected. In any case, the machine translation (eTranslation) in TED 2.0 works in the same way as in the current TED website.
- What is the date on which the new site will be put into production?
Answer: TED 2.0 (the new TED website) is planned to go live in January 2024. This still needs to be confirmed and, in any case, the specific date will be announced in advance.
- It would be great to get a couple of weeks’ notice before the new TED goes live. Are you going to email us when you have the exact date for going live?
Answer: Yes, the specific date will be communicated in advance.
- Why is for example https://gamma.tedv2.spikeseed.cloud/de/notice/632932-2023/html not in German?
Answer: The official language of the notice 632932-2023 is Dutch (NL). When the user downloads the notice in German, all the information that comes from code lists is available in German. On the other hand, the free-text fields are not translated. To also get the free-text fields in German, the user must request an eTranslation in German.
- Is the RDF link (TED Semantic Web Service - SWS) available?
Answer: The SPARQL endpoint is available; for further information, documentation etc., please consult the section “TED Semantic Web Service” of the TED Developer Docs website (https://docs.ted.europa.eu/).
- Where can we find the recording of this meeting?
Answer: It is available on the page of this meeting (https://op.europa.eu/en/web/ted-reusers-workshops/agenda-2023-10-18).
API
- Incomplete documentation of the API. Long response time from the API (~10 seconds) and a lot of 500 (Internal Server Error) responses. But that has been getting better in the last week/days. Hopefully the API is stable after the go-live.
Answer: Please see the answer to question 44. Furthermore, as the Gamma environment is just a test environment, there may be some instability from time to time: it does not have the same requirements (performance etc.) and capacity as a production environment. However, when TED 2.0 goes live, the production environment will be stable and performant enough.
- Is there an API to completely download the HTML representation of, for example, https://ted.europa.eu/udl?uri=TED:NOTICE:615672-2023:TEXT:DE:HTML
Answer: In TED 2.0 a user can download a notice using a direct link in the following formats: HTML, HTML download, Signed PDF, Non-signed PDF, XML. The format “HTML download” is a new feature which downloads the HTML representation of the notice.
- Do you impose a rate limit for the Query/Render API? If yes, can it be increased, so that complex queries can be constructed outside (that is, combining results of subqueries to new result lists)?
Answer: Any usage or rate limits for TED 2.0 APIs will be defined in its fair usage policy, which is under preparation. In TED 2.0 there is no visualization API.
- Will the new search API work with old TED formats, too? Do you provide an automatic translation of fields (that is, two letter codes vs. three letter codes)? Or is it necessary to specify a special form of question in order to get the old format and a second one for the new format?
Answer: The new search API can search and retrieve both TED schema and eForms notices.
As regards the “two letter codes vs. three letter codes” part and if this concerns the code values for the code lists language and country, the query only accepts a code value in 3 letter codes (this applies both in the current TED website and the TED 2.0).
Applicable TED schema notices (where the code value is in a 2-letter code) will be included in the result list together with the applicable eForms notices (where the code value is in a 3-letter code).
- Will there be an API that returns the list of available bulk notices packages?
Answer: No.
- Why is there the notice rendering (visualization) API when we can directly download the notice as HTML or PDF?
Answer: Please see the answer to question 17.
- Will the limits on the number of calls to APIs or websites be increased?
Answer: Please see the answer to question 39.
- Can you please share the links to the swagger documentation for the search and the visualization APIs?
Answer: The link to the swagger page for the TED 2.0 search API in the reusers test environment is: https://api.gamma.tedv2.spikeseed.cloud/swagger-ui/index.html. TED 2.0 will offer no visualization API. Please see also the answer to question 17.
- There are limitations on your web servers (see “TED HTTP and FTP usage thresholds” in https://ted.europa.eu/TED/misc/news.do). We must get the HTML as quickly as possible, and these limitations are a problem. How can we get the HTML via direct links more quickly without being blocked?
Answer: What is mentioned in the page referenced in the question concerns the current TED website. As regards TED 2.0, please see the answer to question 39.
SEARCH
- Where can the XPath definitions for the expert search types be found? Current documentation only lists the XPath for the old TED documents, not the eForms.
Answer: There is no such documentation available currently for eForms. However, the documentation and information related to eForms on the TED Developer Docs website (https://docs.ted.europa.eu) helps in understanding what eForms are, how an eForms notice is structured, which coded data is used etc. With this, one can see which information can be found where within an eForms notice XML file.
- How could I search in the expert mode for cpv-code 71000000 but without including the subcodes (71200000, 71210000, 712220000, 71221000…. 71900000)? I want to search only the core code 71000000.
Answer: The following expert search query can be used:
classification-cpv IN (71000000) AND classification-cpv NOT IN (711* 712* 713* 714* 715* 716* 717* 718* 719*)
- I understand that in eForms, change notices should take the NoticeTypeCode of the referenced notice to which the change relates. So how can I filter in the expert search specifically for change notices? In the current TED and with the old format, I can filter change notices. However, I cannot filter specifically for change notices in eForms format, as they don't have the notice-type corr.
Answer: To get the list of change notices (in eForms), the user has to execute a query where the value of the search field change-notice-version-identifier (alias: BT-758-notice) is not empty. For example: publication-number in (632933-2023, 605302-2023, 561966-2023) and change-notice-version-identifier=(*)
- Where exactly is the list of expert-search-field to XPath in eForms? Example: What is the part of an eForms-XML, which is rendered as "CY" (country of the buyer)?
Answer: For the time being, there is no list for this. Please also see the answer to question 46.
- Regarding the notice-search-v3 API: What are the possible values for the field scope? In the past it was used with numbered values. Is there an overview of all possible values for all fields?
Answer: The expert search page gives the list of fields, the description, and examples.
- Regarding the notice-search-v3 API: How can I identify the notice-type (previously described by field TD)? The new field notice-type is only giving me cn-standard. Is this an issue with the API, or the way I use it or are the Contracting Authorities providing the data in a wrong format?
Answer: The search field TD is still available to select the TED schema notices.
PDF
- In Italy we need to archive only PDF/A files. Our eSender downloads the notices from eNotices through an automation, and these are in PDF, not in PDF/A. If I am not wrong, notices published in TED are in PDF/A. Is it possible for an eSender to implement a service that downloads PDF/A from TED?
Answer: In TED 2.0 a user can download a notice using a direct link in the following formats: HTML, HTML download, Signed PDF, Non-signed PDF, XML. How this can be automated or scripted is up to the reuser. The formats Signed PDF and Non-signed PDF comply with the PDF/A-1A standard.
- On the current TED, the Signed PDF (for eForms notices) is sometimes not available immediately at publication time in the morning but becomes available later during the day. Why is there this delay (apparently only for the Signed PDF, and only in some cases), and will this be resolved in the new TED?
Answer: This issue has been solved. Please also see the answer to question 16 (it concerns both the HTMLs and the PDFs).